GLYCOSIDASE ENZYMES

This application is a continuation-in-part of pending 
patent application 08/583,787 filed January 11, 1996. 

This invention relates to newly identified
polynucleotides, polypeptides encoded by such
polynucleotides, the use of such polynucleotides and
polypeptides, as well as the production and isolation of such
polynucleotides and polypeptides. More particularly, the
polynucleotides and polypeptides of the present invention have
been putatively identified as glucosidases, α-galactosidases,
β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

The glycosidic bond of β-galactosides can be cleaved by
different classes of enzymes: (i) phospho-β-galactosidases
(EC 3.2.1.85) are specific for a phosphorylated substrate
generated via phosphoenolpyruvate phosphotransferase system
(PTS)-dependent uptake; (ii) typical β-galactosidases (EC
3.2.1.23), represented by the Escherichia coli LacZ enzyme,
which are relatively specific for β-galactosides; and (iii)
β-glucosidases (EC 3.2.1.21) such as the enzymes of
Agrobacterium faecalis, Clostridium thermocellum, Pyrococcus
furiosus or Sulfolobus solfataricus (Day, A.G. and Withers,
S.G., (1986) Purification and characterization of a
β-glucosidase from Alcaligenes faecalis. Can. J. Biochem. Cell.
Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J.
Biochem., 213, 305-312; Ait, N., Cruezet, N. and Cattaneo, J.
(1982) Properties of β-glucosidase purified from Clostridium
thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W.
(1991) Evidence that β-galactosidase of Sulfolobus
solfataricus is only one of several activities of a
thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57,
1644-1649). Members of the latter group, although highly
specific with respect to the β-anomeric configuration of the
glycosidic linkage, often display a rather relaxed substrate
specificity and hydrolyse β-glucosides as well as β-fucosides
and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze
the hydrolysis of galactose groups on a polysaccharide
backbone or catalyze the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the
hydrolysis of mannose groups internally on a polysaccharide
backbone or catalyze the cleavage of di- or
oligosaccharides comprising mannose groups. β-Mannosidases
hydrolyze non-reducing, terminal mannose residues on a
mannose-containing polysaccharide and catalyze the cleavage
of di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide
composed of a β-1,4 linked mannose backbone with α-1,6 linked
galactose side chains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and
α-galactosidase. β-Mannanase hydrolyses the mannose backbone
internally and β-mannosidase hydrolyses non-reducing,
terminal mannose residues. α-Galactosidase hydrolyses
α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that
degrade them have a variety of applications. Guar is
commonly used as a thickening agent in food and is utilized
in hydraulic fracturing in oil and gas recovery.
Consequently, galactomannanases are industrially relevant for
the degradation and modification of guar. Furthermore, a
need exists for thermostable galactomannanases that are active
in extreme conditions associated with drilling and well
stimulation.

There are other applications for these enzymes in
various industries, such as in the beet sugar industry.
Sucrose from sugar beets accounts for 20-30% of domestic
U.S. sucrose consumption. Raw beet sugar can contain a small
amount of raffinose when the sugar beets are stored before
processing and rotting begins to set in. Raffinose inhibits
the crystallization of sucrose and also constitutes a hidden
quantity of sucrose. Thus, there is merit to eliminating
raffinose from raw beet sugar. α-Galactosidase has also been
used as a digestive aid to break down raffinose, stachyose,
and verbascose in such foods as beans and other gassy foods.

β-Galactosidases which are active and stable at high
temperatures appear to be superior enzymes for the production
of lactose-free dietary milk products (Chaplin, M.F. and
Bucke, C. (1990) In: Enzyme Technology, pp. 159-160,
Cambridge University Press, Cambridge, UK). Also, several
studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides
via transglycosylation reactions (Nilsson, K.G.I. (1988)
Enzymatic synthesis of oligosaccharides. Trends Biotechnol.
6, 156-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide
synthesis by enzymatic transglycosylation. Glycoconjugate J.
7, 145-162). Despite the commercial potential, only a few
β-galactosidases of thermophiles have been characterized so
far. Two genes reported encode β-galactoside-cleaving enzymes
of the hyperthermophilic bacterium Thermotoga maritima, one
of the most thermophilic organotrophic eubacteria described
to date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M.,
Woese, C.R., Sleytr, U.B. and Stetter, K.O. (1986) T. maritima
sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C, Arch. Microbiol.
144, 324-333). The gene products have been
identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of
pullulan and starch. The enzyme hydrolyzes α-1,6-glucosidic
linkages on these polymers. Starch degradation for the
production of sweeteners (glucose or maltose) is a very
important industrial application of this enzyme. The
degradation of starch proceeds in two stages. The first
stage involves the liquefaction of the substrate with
α-amylase, and the second stage, or saccharification stage, is
performed by β-amylase with pullulanase added as a
debranching enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial
applications. For instance, the endoglucanases of the
present invention can hydrolyze the internal β-1,4-glycosidic
bonds in cellulose, which may be used for the conversion of
plant biomass into fuels and chemicals. Endoglucanases also
have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the
fruit juice and brewing industry for the clarification and
extraction of juices.

The polynucleotides and polypeptides of the present
invention have been identified as glucosidases,
α-galactosidases, β-galactosidases, β-mannosidases,
β-mannanases, endoglucanases, and pullulanases as a result of
their enzymatic activity.

In accordance with one aspect of the present invention,
there are provided novel enzymes, as well as active
fragments, analogs and derivatives thereof.

In accordance with another aspect of the present
invention, there are provided isolated nucleic acid molecules
encoding the enzymes of the present invention including
mRNAs, cDNAs, genomic DNAs as well as active analogs and
fragments of such enzymes.

In accordance with another aspect of the present
invention there are provided isolated nucleic acid molecules
encoding mature polypeptides expressed by the DNA contained
in ATCC Deposit No. 97379.

In accordance with yet a further aspect of the present
invention, there is provided a process for producing such
polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells,
containing a nucleic acid sequence of the present invention,
under conditions promoting expression of said enzymes and
subsequent recovery of said enzymes.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the
food processing industry, the pharmaceutical industry, for
example, to treat intolerance to lactose, as a diagnostic
reporter molecule, in corn wet milling, in the fruit juice
industry, in baking, in the textile industry and in the
detergent industry.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose
residues. Furthermore, polysaccharides such as galactomannan
and the enzymes according to the invention that degrade them
have a variety of applications. Guar gum is commonly used as a
thickening agent in food and also is utilized in hydraulic
fracturing in oil and gas recovery. Consequently, mannanases
are industrially relevant for the degradation and
modification of guar gums. Furthermore, a need exists for
thermostable mannanases that are active in extreme conditions
associated with drilling and well stimulation.

In accordance with yet a further aspect of the present
invention, there are also provided nucleic acid probes
comprising nucleic acid molecules of sufficient length to
specifically hybridize to a nucleic acid sequence of the
present invention.

In accordance with yet a further aspect of the present
invention, there is provided a process for utilizing such
enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example,
to generate probes for identifying similar sequences which
might encode similar enzymes from other organisms by using
certain regions, i.e., conserved sequence regions, of the
nucleotide sequence.

These and other aspects of the present invention should
be apparent to those skilled in the art from the teachings
herein.



Brief Description of the Drawings

The following drawings are illustrative of embodiments
of the invention and are not meant to limit the scope of the
invention as encompassed by the claims.

Figure 1 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of M11TL of the
present invention. Sequencing was performed using a 378
automated DNA sequencer for all sequences of the present
invention (Applied Biosystems, Inc.).

Figure 2 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of OC1/4V-33B/G.

Figure 3 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of F1-12G.

Figure 4 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 9N2-31B/G.

Figure 5 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of MSB8-6G.

Figure 6 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of AEDII12RA-18B/G.

Figure 7 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of GC74-22G.

Figure 8 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of VC1-7G1.

Figure 9 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 37GP1.

Figure 10 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GC2.

Figure 11 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GP2.

Figure 12 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 63GB1.

Figure 13 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of OC1/4V.

Figure 14 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GP3.

Definitions 

The term "gene" means the segment of DNA involved in
producing a polypeptide chain; it includes regions preceding
and following the coding region (leader and trailer) as well
as intervening sequences (introns) between individual coding
segments (exons).

A coding sequence is "operably linked to" another coding
sequence when RNA polymerase will transcribe the two coding
sequences into a single mRNA, which is then translated into
a single polypeptide having amino acids derived from both
coding sequences. The coding sequences need not be
contiguous to one another so long as the expressed sequences
ultimately process to produce the desired protein.

"Recombinant" enzymes refer to enzymes produced by
recombinant DNA techniques; i.e., produced from cells
transformed by an exogenous DNA construct encoding the
desired enzyme. "Synthetic" enzymes are those prepared by
chemical synthesis.

A DNA "coding sequence of" or a "nucleotide sequence
encoding" a particular enzyme, is a DNA sequence which is
transcribed and translated into an enzyme when placed under
the control of appropriate regulatory sequences.



In accordance with an aspect of the present invention,
there are provided isolated nucleic acids (polynucleotides)
which encode for the mature enzymes having the deduced amino
acid sequences of Figures 1-14 (SEQ ID NOS:15-28).

In accordance with another aspect of the present
invention, there are provided isolated polynucleotides
encoding the enzymes of the present invention. The deposited
material is a mixture of genomic clones comprising DNA
encoding an enzyme of the present invention. Each genomic
clone comprising the respective DNA has been inserted into a
pBluescript vector (Stratagene, La Jolla, CA). The deposit
has been deposited with the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852, USA, on
December 13, 1995 and assigned ATCC Deposit No. 97379.

The deposit(s) have been made under the terms of the
Budapest Treaty on the International Recognition of the
Deposit of Micro-organisms for Purposes of Patent Procedure.
The strains will be irrevocably and without restriction or
condition released to the public upon the issuance of a
patent. These deposits are provided merely as a convenience to
those of skill in the art and are not an admission that a
deposit is required under 35 U.S.C. §112. The sequences of
the polynucleotides contained in the deposited materials, as
well as the amino acid sequences of the polypeptides encoded
thereby, are controlling in the event of any conflict with
any description of sequences herein. A license may be
required to make, use or sell the deposited materials, and no
such license is hereby granted.

Detailed Description of the Invention 

The polynucleotides of this invention were originally 
recovered from genomic gene libraries derived from the 
following organisms: 



Summary of the Invention 
M11TL is a new species of Desulfurococcus isolated from
Diamond Pool in Yellowstone National Park. The organism
grows optimally at 85-88°C, pH 7.0 in a low salt medium
containing yeast extract, peptone, and gelatin as substrates
with a N2/CO2 gas phase.

OC1/4V is from the genus Thermotoga. The organism was
isolated from Yellowstone National Park. It grows optimally
at 75°C in a low salt medium with cellulose as a substrate
and N2 in gas phase.

Pyrococcus furiosus VC1 is from the genus Pyrococcus.
VC1 was isolated from Vulcano, Italy. It grows optimally at
100°C in a high salt medium (marine) containing elemental
sulfur, yeast extract, peptone and starch as substrates and
N2 in gas phase.

Staphylothermus marinus F1 is from the genus
Staphylothermus. F1 was isolated from Vulcano, Italy. It
grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates
and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2
was isolated from diffuse vent fluid in the East Pacific
Rise. It is a strict anaerobe that grows optimally at 87°C.

Thermotoga maritima MSB8 is from the genus Thermotoga,
and was isolated from Vulcano, Italy. MSB8 grows optimally
at 85°C, pH 6.5 in a high salt medium (marine) containing
starch and yeast extract as substrates and N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus
Thermococcus. AEDII12RA grows optimally at 85°C, pH 9.5 in
a high salt medium (marine) containing polysulfides and yeast
extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus
Thermococcus. GC74 grows optimally at 85°C, pH 6.0 in a high
salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas
phase.

AEPII 1a grows optimally at 85°C at pH 6.5 in marine
medium under anaerobic conditions. It has many substrates.
[Add. descriptions of new organisms]

Accordingly, the polynucleotides and enzymes encoded
thereby are identified by the organism from which they were
isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G"
(Figure 2 and SEQ ID NOS:2 and 16), "F1-12G" (Figure 3 and
SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4 and SEQ ID NOS:4
and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20),
"GC74-22G" (Figure 7 and SEQ ID NOS:7 and 21), "VC1-7G1"
(Figure 8 and SEQ ID NOS:8 and 22), "37GP1" (Figure 9 and
SEQ ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10
and 24), "6GP2" (Figure 11 and SEQ ID NOS:11 and 25),
"AEPII 1a" (Figure 12 and SEQ ID NOS:12 and 26), "OC1/4V"
(Figure 13 and SEQ ID NOS:13 and 27), and "6GP3" (Figure 14
and SEQ ID NOS:28).

The polynucleotides and polypeptides of the present 
invention show identity at the nucleotide and protein level 
to known genes and proteins encoded thereby as shown in Table 
1. 



Table 1

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
M11TL-29G | Sulfolobus solfataricus DSM 1616/P1, β-galactosidase | 51% | 55%
OC1/4V-33B/G | Caldocellum saccharolyticum, β-glucosidase | 52% | 57%
Staphylothermus marinus F1-12G | Bacillus polymyxa, β-galactosidase | 36% | 48%
Thermococcus | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 51% | 50%
Thermotoga maritima MSB8-6G | Clostridium thermocellum bglB | 45% | 53%
Thermococcus AEDII12RA-18B/G | Bacillus polymyxa, β-galactosidase | [not legible] | [not legible]
Thermococcus chitonophagus GC74-22G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 46% | 54%
Pyrococcus furiosus VC1-7G1 | Sulfolobus solfataricus MT4, β-galactosidase | 46.4% | 52.5%
Thermotoga maritima α-galactosidase (6GC2) | Pediococcus pentosaceus α-galactosidase | 49% | 29%
Thermotoga maritima β-mannanase (6GP2) | Aspergillus aculeatus | 56% | 37%
AEPII 1a β-mannosidase (63GB1) | Sulfolobus solfataricus β-galactosidase | 78% | 56%
OC1/4V endoglucanase (33GP1) | Clostridium thermocellum endo-1,4-β-endoglucanase | 65% | 43%
Thermotoga maritima pullulanase (6GP3) | Caldocellum saccharolyticum α-dextran 6-glucanohydrolase | 72 | 53
Bankia gouldi mix endoglucanase (37GP1) | None available | |

The polynucleotides and enzymes of the present invention
show homology to each other as shown in Table 2.

Table 2

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
Staphylothermus marinus F1-12G | Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase | 55% | 57%
Thermococcus 9N2-31B/G | Thermococcus chitonophagus GC74-22G, glucosidase | 74% | 66%
Pyrococcus furiosus VC1-7G1 | Pyrococcus furiosus VC1-7B/G, β-galactosidase | 46.4% | 54%


All the clones identified in Tables 1 and 2 encode
polypeptides which have α-glycosidase or β-glycosidase
activity.

This invention, in addition to the isolated nucleic acid
molecules encoding the enzymes of the present invention, also
provides substantially similar sequences. Isolated nucleic
acid sequences are substantially similar if: (i) they are
capable of hybridizing under conditions hereinafter
described, to the polynucleotides of SEQ ID NOS:1-8; or (ii)
they are DNA sequences which are degenerate to the
polynucleotides of SEQ ID NOS:1-8. Degenerate DNA sequences
encode the amino acid sequences of SEQ ID NOS:9-16, but have
variations in the nucleotide coding sequences. As used
herein, substantially similar refers to the sequences having
similar identity to the sequences of the instant invention.
The nucleotide sequences that are substantially the same can
be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially the same can be
identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.

One means for isolating the nucleic acid molecules
encoding the enzymes of the present invention is to probe a
gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current
Protocols in Molecular Biology, Ausubel F.M. et al. (eds.)
Green Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in
the art that the polynucleotides of SEQ ID NOS:1-14 or
fragments thereof (comprising at least 12 contiguous
nucleotides), are particularly useful probes. Other
particularly useful probes for this purpose are hybridizable
fragments to the sequences of SEQ ID NOS:1-14 (i.e.,
comprising at least 12 contiguous nucleotides).
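
As a rough illustration of the fragment length requirement above, the following Python sketch enumerates every contiguous fragment of at least 12 nucleotides from a sequence. The function name and the 18-nucleotide example sequence are hypothetical and are not taken from SEQ ID NOS:1-14; choosing which fragments make good probes is outside the scope of the sketch.

```python
def contiguous_fragments(sequence, min_length=12):
    """Yield (start, fragment) for every contiguous fragment of at
    least min_length bases. Illustrative helper, not from the source."""
    seq = sequence.upper()
    for length in range(min_length, len(seq) + 1):
        for start in range(len(seq) - length + 1):
            yield start, seq[start:start + length]


# Hypothetical 18-nucleotide sequence used only for illustration.
example = "ATGGCTAAGTTCCCGAAG"
fragments = list(contiguous_fragments(example))
print(len(fragments), fragments[0])
```

A sequence shorter than the minimum length yields no fragments at all, which mirrors the lower bound stated in the text.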

With respect to nucleic acid sequences which hybridize
to specific nucleic acid sequences disclosed herein,
hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions.
As an example of oligonucleotide hybridization, a polymer
membrane containing immobilized denatured nucleic acids is
first prehybridized for 30 minutes at 45°C in a solution
consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM
Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/mL
polyriboadenylic acid. Approximately 2 X 10^7 cpm (specific
activity 4-9 X 10^8 cpm/µg) of 32P end-labeled oligonucleotide
probe are then added to the solution. After 12-16 hours of
incubation, the membrane is washed for 30 minutes at room
temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride,
pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30
minute wash in fresh 1X SET at Tm-10°C for the
oligonucleotide probe. The membrane is then exposed to
autoradiographic film for detection of hybridization signals.
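
The final wash temperature above is stated relative to the melting temperature (Tm) of the oligonucleotide probe. The text does not say how Tm is to be estimated, so the sketch below assumes the Wallace rule (2°C per A or T, 4°C per G or C), a common approximation for short oligonucleotides. The function names and the example probe are hypothetical.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C with the Wallace rule (an assumption;
    the source does not specify a Tm formula)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def final_wash_temperature(oligo, offset=10):
    """Wash temperature taken as Tm minus 10 degrees C, per the protocol."""
    return wallace_tm(oligo) - offset


probe = "ATGGCTAAGTTCCCGAAG"  # hypothetical 18-mer probe
print(wallace_tm(probe), final_wash_temperature(probe))
```

Other Tm estimates (for example, nearest-neighbor methods) would give different values; only the Tm-10°C relationship comes from the protocol itself.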

Stringent conditions means hybridization will occur only
if there is at least 90% identity, preferably at least 95%
identity and most preferably at least 97% identity between
the sequences. See J.
Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d
Ed., Cold Spring Harbor Laboratory (1989) which is hereby
incorporated by reference in its entirety. Also, it is
understood that a fragment of a 100 bps sequence that is 95
bps in length has 95% identity with the 100 bps sequence from
which it is obtained.

As used herein, a first DNA (RNA) sequence is at least
70% and preferably at least 80% identical to another DNA
(RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases
of the first sequence and the bases of the other sequence,
when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a 
polynucleotide sequence which comprises a percentage of the
same bases as a reference polynucleotide (SEQ ID NOS:1-8).
For example, a polynucleotide which is at least 90% identical
to a reference polynucleotide, has polynucleotide bases which
are identical in 90% of the bases which make up the reference
polynucleotide and may have different bases in 10% of the
bases which comprise that polynucleotide sequence.
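
The definition of identity above can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned to equal length, with '-' marking gaps. It is not the BLASTN procedure, and the function name and test sequences are hypothetical.

```python
def percent_identity(reference, candidate):
    """Percentage of reference bases that the candidate matches at the
    same aligned position. Inputs must be pre-aligned, equal-length
    strings with '-' for gaps (an assumption of this sketch)."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(r, c) for r, c in zip(reference.upper(), candidate.upper())
             if r != "-"]
    if not pairs:
        raise ValueError("reference contains no bases")
    matches = sum(1 for r, c in pairs if r == c)
    return 100.0 * matches / len(pairs)


reference = "ACGT" * 25                      # hypothetical 100-base reference
fragment = reference[:95] + "-" * 5          # 95-base fragment of the reference
variant = "TGCATGCATG" + reference[10:]      # 10 substituted bases
print(percent_identity(reference, fragment))
print(percent_identity(reference, variant))
```

Under these assumptions the 95-base fragment scores 95% against its 100-base parent, and a full-length sequence with 10 differing bases scores 90%, in line with the examples in the text.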

The present invention relates to polynucleotides which
differ from the reference polynucleotide such that the
changes are silent changes, for example the changes do not
alter the amino acid sequence encoded by the polynucleotide.
The present invention also relates to nucleotide changes
which result in amino acid substitutions, additions,
deletions, fusions and truncations in the polypeptide encoded
by the reference polynucleotide. In a preferred aspect of
the invention these polypeptides retain the same biological
action as the polypeptide encoded by the reference
polynucleotide.
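
To make the notion of a silent change concrete, the sketch below translates two coding sequences with the standard genetic code and checks whether they encode the same amino acids. The sequences are short hypothetical examples, not sequences of the invention, and the helper names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding DNA sequence codon by codon ('*' marks stop)."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_silent(reference, variant):
    """True if the variant differs from the reference at the nucleotide
    level but encodes the same amino acid sequence."""
    return reference.upper() != variant.upper() and \
        translate(reference) == translate(variant)


reference = "ATGGCTTAA"   # hypothetical: Met-Ala-stop
silent = "ATGGCCTAA"      # GCT -> GCC, both encode Ala
missense = "ATGGATTAA"    # GCT -> GAT changes Ala to Asp
print(is_silent(reference, silent), is_silent(reference, missense))
```

The same comparison applies to degenerate coding sequences in general: any two sequences for which `translate` returns the same protein differ only by silent changes.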

It is also appreciated that such probes can be and are
preferably labeled with an analytically detectable reagent to
facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent
dyes or enzymes capable of catalyzing the formation of a
detectable product. The probes are thus useful to isolate
complementary copies of DNA from other sources or to screen
such sources for related sequences.

The polynucleotides of this invention were recovered 
from genomic gene libraries from the organisms listed in 
Table 1. For example, gene libraries can be generated in the 
Lambda ZAP II cloning vector (Stratagene Cloning Systems).
Mass excisions can be performed on these libraries to 
generate libraries in the pBluescript phagemid. Libraries 
are thus generated and excisions performed according to the 
protocols/methods hereinafter described. 

The excision libraries are introduced into the E. coli
strain BW14893 F'kan1A. Expression clones are then
identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other
glycosidases are identified and repurified. The
polynucleotides, and enzymes encoded thereby, of the present
invention, yield the activities as described above.

The coding sequences for the enzymes of the present
invention were identified by screening the genomic DNAs
prepared for the clones having glucosidase or galactosidase
activity.

An example of such an assay is a high temperature filter
assay wherein expression clones were identified by use of
high temperature filter assays using Z-buffer (see recipe
below) containing 1 mg/ml of the substrate
5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (XGLU)
(Diagnostic Chemicals Limited or Sigma) after introducing an
excision library into the E. coli strain BW14893 F'kan1A.
Expression clones encoding XGLUases were identified and
repurified from M11TL, OC1/4V, Pyrococcus furiosus VC1,
Staphylothermus marinus F1, Thermococcus 9N-2, Thermotoga
maritima MSB8, Thermococcus alcaliphilus AEDII12RA, and
Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short
Course in Bacterial Genetics, p. 445.)

per liter:

Na2HPO4-7H2O         16.1 g
NaH2PO4-7H2O         5.5 g
KCl                  0.75 g
MgSO4-7H2O           0.246 g
β-mercaptoethanol    2.7 ml

Adjust pH to 7.0
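
Since the recipe above is given per liter, smaller batches require proportional scaling. The following Python sketch scales the listed quantities to an arbitrary volume; the dictionary keys, the function name, and the 250 ml example volume are illustrative assumptions, while the per-liter amounts are those listed above.

```python
# Per-liter amounts as listed in the recipe (grams unless noted).
Z_BUFFER_PER_LITER = {
    "Na2HPO4-7H2O (g)": 16.1,
    "NaH2PO4-7H2O (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4-7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_recipe(per_liter, volume_ml):
    """Scale per-liter amounts linearly to the requested volume in ml."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    factor = volume_ml / 1000.0
    return {name: round(amount * factor, 4)
            for name, amount in per_liter.items()}


batch = scale_recipe(Z_BUFFER_PER_LITER, 250)  # hypothetical 250 ml batch
for name, amount in batch.items():
    print(f"{name}: {amount}")
```

The pH adjustment to 7.0 is still carried out on the final solution and is not captured by the scaling.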

High Temperature Filter Assay 

(1) The F factor F'kan (from E. coli strain CSH118) (1) was
introduced into the pho-pnh-lac- strain BW14893 (2).
The filamentous phage library was plated on
the resulting strain, BW14893 F'kan. (Miller, J.H.
(1992) A Short Course in Bacterial Genetics; Lee, K.S.,
Metcalf, et al., (1992) Evidence for two phosphonate
degradative pathways in Enterobacter aerogenes, J.
Bacteriol., 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml
ampicillin, 80 µg/ml methicillin and 1 mM IPTG, colony
lifts were performed using Millipore HATF membrane
filters.

(3) The colonies transferred to the filters were lysed with
chloroform vapor in 150 mm glass petri dishes.

(4) The filters were transferred to 100 mm glass petri
dishes containing a piece of Whatman 3 MM filter paper
saturated with buffer.

(a) When testing for galactosidase activity
(XGALase), 3 MM paper was saturated with Z buffer
containing 1 mg/ml XGAL (ChemBridge Corporation).
After transferring filter bearing lysed colonies to
the glass petri dish, placed dish in oven at
80-85°C.

(b) When testing for glucosidase (XGLUase), 3 MM
paper was saturated with Z buffer containing 1
mg/ml XGLU. After transferring filter bearing
lysed colonies to the glass petri dish, placed dish
in oven at 80-85°C.

(5) 'Positives' were observed as blue spots on the filter
membranes. Used the following filter rescue technique
to retrieve plasmid from lysed positive colony. Used
pasteur pipette (or glass capillary tube) to core blue
spots on the filter membrane. Placed the small filter
disk in an Eppendorf tube containing 20 µl water.
Incubated the Eppendorf tube at 75°C for 5 minutes
followed by vortexing to elute plasmid DNA off filter.
This DNA was transformed into electrocompetent E. coli
cells DH10B for Thermotoga maritima MSB8-6G,
Staphylothermus marinus F1-12G, Thermococcus
AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G,
M11TL and OC1/4V. Electrocompetent BW14893 F'kan1A
E. coli were used for Thermococcus 9N2-31B/G, and
Pyrococcus furiosus VC1-7G1. Repeated filter-lift assay
on transformation plates to identify 'positives'. Return
transformation plates to 37°C incubator after filter lift
to regenerate colonies. Inoculate 3 ml LB liquid
containing 100 µg/ml ampicillin with repurified positives
and incubate at 37°C overnight. Isolate plasmid DNA from
these cultures and sequence plasmid insert. In some
instances where the plates used for the initial colony
lifts contained non-confluent colonies, a specific colony
corresponding to a blue spot on the filter could be
identified on a regenerated plate and repurified directly,
instead of using the filter rescue technique.

Another example of such an assay is a variation of the
high temperature filter assay wherein colony-laden filters
are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3 MM paper is
saturated with different buffers (i.e., 100 mM NaCl, 5 mM
MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme activity
under different buffer conditions.

A β-glucosidase assay may also be employed, wherein
GlcpβNp is used as an artificial substrate (aryl-β-
glucosidase). The increase in absorbance at 405 nm as a
result of p-nitrophenol (pNp) liberation was followed on a
Hitachi U-1100 spectrophotometer equipped with a
thermostatted cuvette holder. The assays may be performed at
80°C or 90°C in a closed 1-ml quartz cuvette. A standard
reaction mixture contains 150 mM trisodium substrate, pH 5.0
(at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹·cm⁻¹).
The reaction mixture is allowed to reach the desired
temperature, after which the reaction is started by injecting
an appropriate amount of enzyme (1.06 ml final volume).

1 U β-glucosidase activity is defined as that amount
required to catalyze the formation of 1.0 µmol pNp/min.
D-cellobiose may also be used as a substrate.
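
As a hedged illustration of how the unit definition relates to the spectrophotometer reading, the following sketch converts a measured absorbance slope at 405 nm into enzyme units using the Beer-Lambert law. The extinction coefficient of 0.561 mM⁻¹·cm⁻¹, the 1 cm path length, and the function name are assumptions made for illustration; the 1.06 ml final volume is taken from the assay description.

```python
def beta_glucosidase_units(delta_a405_per_min, epsilon_mM_cm=0.561,
                           path_cm=1.0, volume_ml=1.06):
    """Return enzyme units (µmol pNp released per minute) in the cuvette.

    epsilon_mM_cm and path_cm are assumed values, not measured constants.
    """
    if epsilon_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    # Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l), in mM/min
    rate_mM_per_min = delta_a405_per_min / (epsilon_mM_cm * path_cm)
    # 1 mM equals 1 µmol/ml, so multiplying by the volume in ml gives µmol/min
    return rate_mM_per_min * volume_ml


if __name__ == "__main__":
    # Hypothetical slope of 0.0561 absorbance units per minute
    print(round(beta_glucosidase_units(0.0561), 4))
```

Dividing the result by the amount of protein added would give a specific activity; that step is omitted here because the text does not state a protein quantity.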

An ONPG assay for β-galactosidase activity is described
by Miller, J.H. (1992) A Short Course in Bacterial Genetics
and Miller, J.H. (1992) Experiments in Molecular Genetics, the
contents of which are hereby incorporated by reference in
their entirety.

A quantitative fluorometric assay for β-galactosidase
specific activity is described by Youngman, P. (1987)
Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria.
In Plasmids: A Practical Approach (ed. K. Hardy) pp 79-103.
IRL Press, Oxford. A description of the procedure can be 
found in Miller (1992) p. 75-77, the contents of which are 
incorporated by reference herein in their entirety. 

The polynucleotides of the present invention may be in 
the form of DNA, which DNA includes cDNA, genomic DNA, and
synthetic DNA. The DNA may be double-stranded or single-
stranded, and if single-stranded may be the coding strand or
non-coding (anti-sense) strand. The coding sequence which
encodes the mature enzymes may be identical to the coding
sequences shown in Figures 1-8 (SEQ ID NOS: 1-8) or may be a
different coding sequence which coding sequence, as a result
of the redundancy or degeneracy of the genetic code, encodes
the same mature enzymes as the DNA of Figures 1-14 (SEQ ID
NOS: 1-14).
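
The degeneracy point can be made concrete with a short sketch: two different coding sequences are translated with the standard genetic code and compared at the protein level. The function names and the example sequences are hypothetical and chosen only for illustration; they are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(seq_a, seq_b):
    """True if two coding sequences specify the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    # Hypothetical sequences differing only at third codon positions
    print(translate("ATGGCTAAA"), translate("ATGGCGAAG"))
    print(encode_same_protein("ATGGCTAAA", "ATGGCGAAG"))
```

The table covers only the four unambiguous bases; sequences containing ambiguity codes such as N would need separate handling.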

The polynucleotide which encodes for the mature enzyme 
of Figures 1-14 (SEQ ID NOS: 15-28) may include, but is not 
limited to: only the coding sequence for the mature enzyme; 
the coding sequence for the mature enzyme and additional 
coding sequence such as a leader sequence or a proprotein 
sequence; the coding sequence for the mature enzyme (and 
optionally additional coding sequence) and non-coding 
sequence, such as introns or non-coding sequence 5' and/or 3' 
of the coding sequence for the mature enzyme.

Thus, the term "polynucleotide encoding an enzyme 
(protein)" encompasses a polynucleotide which includes only
coding sequence for the enzyme as well as a polynucleotide 
which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the 
hereinabove described polynucleotides which encode for 
fragments, analogs and derivatives of the enzymes having the 
deduced amino acid sequences of Figures 1-14 (SEQ ID NOS: 15- 
28). The variant of the polynucleotide may be a naturally
occurring allelic variant of the polynucleotide or a non- 
naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides 
encoding the same mature enzymes as shown in Figures 1-14
(SEQ ID NOS: 15-28) as well as variants of such
polynucleotides which variants encode for a fragment,
derivative or analog of the enzymes of Figures 1-14 (SEQ ID
NOS: 15-28). Such nucleotide variants include deletion
variants, substitution variants and addition or insertion
variants.

As hereinabove indicated, the polynucleotides may have 
a coding sequence which is a naturally occurring allelic 
variant of the coding sequences shown in Figures 1-14 (SEQ ID 
NOS: 1-14). As known in the art, an allelic variant is an
alternate form of a polynucleotide sequence which may have a 
substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function 
of the encoded enzyme. 

Fragments of the full length gene of the present 
invention may be used as a hybridization probe for a cDNA or 
a genomic library to isolate the full length DNA and to 
isolate other DNAs which have a high sequence similarity to 
the gene or similar biological activity. Probes of this type 
preferably have at least 10, preferably at least 15, and even 
more preferably at least 30 bases and may contain, for 
example, at least 50 or more bases. The probe may also be 
used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the 
complete gene including regulatory and promoter regions,
exons, and introns. An example of a screen comprises
isolating the coding region of the gene by using the known
DNA sequence to synthesize an oligonucleotide probe. Labeled
oligonucleotides having a sequence complementary to that of
the gene of the present invention are used to screen a
library of genomic DNA to determine which members of the 
library the probe hybridizes to. 
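
As a minimal sketch of the probe-design step described above, the following code takes a region of a known coding sequence and returns the complementary oligonucleotide, enforcing a minimum probe length. The function names, the default lengths, and the example sequence are assumptions for illustration only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_probe(gene, start, length=30, min_length=15):
    """Return an oligonucleotide complementary to gene[start:start+length].

    min_length reflects the preferred lower bound on probe size mentioned
    in the text; both defaults are illustrative choices.
    """
    if length < min_length:
        raise ValueError("probe shorter than the minimum length")
    region = gene.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("requested region extends past the end of the gene")
    return reverse_complement(region)


if __name__ == "__main__":
    # Hypothetical 40-base stretch of coding sequence
    gene = "ATGGCTAAAGGTCTGGAACCGTTTACCGATCTGAAAGCGT"
    print(design_probe(gene, start=0, length=30))
```

A real probe design would also consider melting temperature and secondary structure, which this sketch does not attempt.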

The present invention further relates to 
polynucleotides which hybridize to the hereinabove-described
sequences if there is at least 70%, preferably at least 90%,
and more preferably at least 95% identity between the
sequences. The present invention particularly relates to
polynucleotides which hybridize under stringent conditions to
the hereinabove-described polynucleotides. As herein used,
the term "stringent conditions" means hybridization will
occur only if there is at least 95% and preferably at least
97% identity between the sequences. The polynucleotides
which hybridize to the hereinabove described polynucleotides
in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the
mature enzyme encoded by the DNA of Figures 1-14 (SEQ ID
NOS: 1-14).
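
The identity thresholds named above can be sketched as a simple calculation on two pre-aligned sequences of equal length. This is only an illustration under stated assumptions: the sequences are assumed to be already aligned without gaps, the function names and tier labels are hypothetical, and actual hybridization behavior depends on experimental conditions that a percentage alone does not capture.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Thresholds taken from the text, highest first
TIERS = [
    (97.0, "at least 97% (preferred stringent)"),
    (95.0, "at least 95% (stringent)"),
    (90.0, "at least 90%"),
    (70.0, "at least 70%"),
]


def identity_tier(pct):
    """Map a percent identity onto the highest threshold it satisfies."""
    for threshold, label in TIERS:
        if pct >= threshold:
            return label
    return "below 70%"


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, identity_tier(pct))
```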

Alternatively, the polynucleotide may have at least 15 
bases, preferably at least 30 bases, and more preferably at
least 50 bases which hybridize to any part of a 
polynucleotide of the present invention and which has an 
identity thereto, as hereinabove described, and which may or 
may not retain activity. For example, such polynucleotides 
may be employed as probes for the polynucleotides of SEQ ID 
NOS: 1-14, for example, for recovery of the polynucleotide or 
as a diagnostic probe or as a PCR primer.

Thus, the present invention is directed to 
polynucleotides having at least a 70% identity, preferably at 
least 90% identity and more preferably at least a 95% 
identity to a polynucleotide which encodes the enzymes of SEQ 
ID NOS: 15-28 as well as fragments thereof, which fragments
have at least 15 bases, preferably at least 30 bases and most
preferably at least 50 bases, which fragments are at least 
90% identical, preferably at least 95% identical and most 
preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which 
have the deduced amino acid sequences of Figures 1-14 (SEQ ID 
NOS: 15-28) as well as fragments, analogs and derivatives of 
such enzymes.

The terms "fragment," "derivative" and "analog" when 
referring to the enzymes of Figures 1-14 (SEQ ID NOS: 15-28)
mean enzymes which retain essentially the same biological
function or activity as such enzymes. Thus, an analog
includes a proprotein which can be activated by cleavage of
the proprotein portion to produce an active mature enzyme. 

The enzymes of the present invention may be a 
recombinant enzyme, a natural enzyme or a synthetic enzyme, 
preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of 
Figures 1-14 (SEQ ID NOS: 15-28) may be (i) one in which one 
or more of the amino acid residues are substituted with a 
conserved or non-conserved amino acid residue (preferably a
conserved amino acid residue) and such substituted amino acid
residue may or may not be one encoded by the genetic code, or
(ii) one in which one or more of the amino acid residues
includes a substituent group, or (iii) one in which the
mature enzyme is fused with another compound, such as a
compound to increase the half-life of the enzyme (for
example, polyethylene glycol), or (iv) one in which the 
additional amino acids are fused to the mature enzyme, such 
as a leader or secretory sequence or a sequence which is 
employed for purification of the mature enzyme or a 
proprotein sequence. Such fragments, derivatives and analogs 
are deemed to be within the scope of those skilled in the art 
from the teachings herein. 

The enzymes and polynucleotides of the present invention 
are preferably provided in an isolated form, and preferably 
are purified to homogeneity. 

The term "isolated" means that the material is removed 
from its original environment (e.g., the natural environment 
if it is naturally occurring). For example, a naturally-
occurring polynucleotide or enzyme present in a living animal
is not isolated, but the same polynucleotide or enzyme, 
separated from some or all of the coexisting materials in the 
natural system, is isolated. Such polynucleotides could be 
part of a vector and/or such polynucleotides or enzymes could
be part of a composition, and still be isolated in that such 
vector or composition is not part of its natural environment. 



WO 97/25417 PCT/US97/00092

The enzymes of the present invention include the enzymes 
of SEQ ID NOS: 15-28 (in particular the mature enzyme) as well 
as enzymes which have at least 70% similarity (preferably at 
least 70% identity) to the enzymes of SEQ ID NOS: 9-16 and 
more preferably at least 90% similarity (more preferably at 
least 90% identity) to the enzymes of SEQ ID NOS: 15-28 and
still more preferably at least 95% similarity (still more 
preferably at least 95% identity) to the enzymes of SEQ ID 
NOS: 9-16 and also include portions of such enzymes with such 
portion of the enzyme generally containing at least 30 amino 
acids and more preferably at least 50 amino acids. 

As known in the art "similarity" between two enzymes is 
determined by comparing the amino acid sequence and its 
conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative"
polypeptide, and reference polypeptide may differ in amino 
acid sequence by one or more substitutions, additions, 
deletions, fusions and truncations, which may be present in 
any combination. 

Among preferred variants are those that vary from a 
reference by conservative amino acid substitutions. Such 
substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. 
Typically seen as conservative substitutions are the 
replacements, one for another, among the aliphatic amino 
acids Ala, Val, Leu and Ile; interchange of the hydroxyl
residues Ser and Thr, exchange of the acidic residues Asp and
Glu, substitution between the amide residues Asn and Gln,
exchange of the basic residues Lys and Arg and replacements
among the aromatic residues Phe and Tyr.
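
The conservative substitution groups listed above can be expressed as a small lookup, which also gives one possible way to score "similarity" between two aligned enzyme sequences. This is a sketch under assumptions: the one-letter groupings simply transcribe the residues named in the text, the sequences are assumed to be pre-aligned and of equal length, and the function names are hypothetical.

```python
# Groups transcribed from the text, using one-letter amino acid codes
CONSERVATIVE_GROUPS = [
    {"A", "V", "L", "I"},  # aliphatic: Ala, Val, Leu, Ile
    {"S", "T"},            # hydroxyl: Ser, Thr
    {"D", "E"},            # acidic: Asp, Glu
    {"N", "Q"},            # amide: Asn, Gln
    {"K", "R"},            # basic: Lys, Arg
    {"F", "Y"},            # aromatic: Phe, Tyr
]


def is_conservative(res_a, res_b):
    """True if two different residues fall in the same substitution group."""
    a, b = res_a.upper(), res_b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_similarity(seq_a, seq_b):
    """Percent of aligned positions that are identical or conservative."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    similar = sum(
        a == b or is_conservative(a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
    )
    return 100.0 * similar / len(seq_a)


if __name__ == "__main__":
    # Hypothetical five-residue peptides
    print(percent_similarity("ALSDK", "VLTEG"))
```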

Most highly preferred are variants which retain the same 
biological function and activity as the reference polypeptide 
from which they vary.

Fragments or portions of the enzymes of the present
invention may be employed for producing the corresponding
full-length enzyme by peptide synthesis; therefore, the
fragments may be employed as intermediates for producing the
full-length enzymes. Fragments or portions of the
polynucleotides of the present invention may be used to
synthesize full-length polynucleotides of the present 
invention. 

The present invention also relates to vectors which 
include polynucleotides of the present invention, host cells 
which are genetically engineered with vectors of the 
invention and the production of enzymes of the invention by 
recombinant techniques.

Host cells are genetically engineered (transduced or 
transformed or transfected) with the vectors of this 
invention which may be, for example, a cloning vector or an 
expression vector. The vector may be, for example, in the 
form of a plasmid, a viral particle, a phage, etc. The 
engineered host cells can be cultured in conventional 
nutrient media modified as appropriate for activating 
promoters, selecting transformants or amplifying the genes of
the present invention. The culture conditions, such as 
temperature, pH and the like, are those previously used with 
the host cell selected for expression, and will be apparent 
to the ordinarily skilled artisan. 

The polynucleotides of the present invention may be 
employed for producing enzymes by recombinant techniques. 
Thus, for example, the polynucleotide may be included in any 
one of a variety of expression vectors for expressing an 
enzyme. Such vectors include chromosomal, nonchromosomal and 
synthetic DNA sequences, e.g., derivatives of SV40; bacterial 
plasmids; phage DNA; baculovirus; yeast plasmids; vectors
derived from combinations of plasmids and phage DNA; viral
DNA such as vaccinia, adenovirus, fowl pox virus, and
pseudorabies. However, any other vector may be used as long
as it is replicable and viable in the host. 

The appropriate DNA sequence may be inserted into the 
vector by a variety of procedures. In general, the DNA 
sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such 
procedures and others are deemed to be within the scope of 
those skilled in the art. 

The DNA sequence in the expression vector is operatively 
linked to an appropriate expression control sequence(s)
(promoter) to direct mRNA synthesis. As representative
examples of such promoters, there may be mentioned: LTR or
SV40 promoter, the E. coli lac or trp, the phage lambda PL
promoter and other promoters known to control expression of 
genes in prokaryotic or eukaryotic cells or their viruses. 
The expression vector also contains a ribosome binding site 
for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for 
amplifying expression. 

In addition, the expression vectors preferably contain 
one or more selectable marker genes to provide a phenotypic 
trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic
cell culture, or such as tetracycline or ampicillin
resistance in E. coli.

The vector containing the appropriate DNA sequence as 
hereinabove described, as well as an appropriate promoter or 
control sequence, may be employed to transform an appropriate 
host to permit the host to express the protein. 

As representative examples of appropriate hosts, there 
may be mentioned: bacterial cells, such as E. coli,
Streptomyces, Bacillus subtilis; fungal cells, such as yeast;
insect cells such as Drosophila S2 and Spodoptera Sf9; animal
cells such as CHO, COS or Bowes melanoma; adenoviruses; plant
cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the
teachings herein. 

More particularly, the present invention also includes 
recombinant constructs comprising one or more of the 
sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into 
which a sequence of the invention has been inserted, in a 
forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory 
sequences, including, for example, a promoter, operably 
linked to the sequence. Large numbers of suitable vectors 
and promoters are known to those of skill in the art, and are 
commercially available. The following vectors are provided 
by way of example. Bacterial: pQE70, pQE60, pQE-9 (Qiagen),
pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A,
pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540,
pRIT5 (Pharmacia); Eukaryotic: pSV2CAT, pOG44, pXT1, pSG
(Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). However,
any other plasmid or vector may be used as long as they are
replicable and viable in the host.

Promoter regions can be selected from any desired gene 
using CAT (chloramphenicol transferase) vectors or other 
vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp.
Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus,
and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary
skill in the art. 

In a further embodiment, the present invention relates 
to host cells containing the above-described constructs. The
host cell can be a higher eukaryotic cell, such as a
mammalian cell, or a lower eukaryotic cell, such as a yeast
cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the construct into the host
cell can be effected by calcium phosphate transfection, DEAE-
Dextran mediated transfection, or electroporation (Davis, L.,
Dibner, M., Battey, I., Basic Methods in Molecular Biology,
(1986)).

The constructs in host cells can be used in a 
conventional manner to produce the gene product encoded by
the recombinant sequence. Alternatively, the enzymes of the
invention can be synthetically produced by conventional
peptide synthesizers.

Mature proteins can be expressed in mammalian cells, 
yeast, bacteria, or other cells under the control of 
appropriate promoters. Cell-free translation systems can
also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention.
Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by Sambrook,
et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of
which is hereby incorporated by reference.

Transcription of the DNA encoding the enzymes of the
present invention by higher eukaryotes is increased by
inserting an enhancer sequence into the vector. Enhancers
are cis-acting elements of DNA, usually from about 10 to 300
bp, that act on a promoter to increase its transcription.
Examples include the SV40 enhancer on the late side of the
replication origin bp 100 to 270, a cytomegalovirus early 
promoter enhancer, the polyoma enhancer on the late side of 
the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include 
origins of replication and selectable markers permitting 
transformation of the host cell, e.g., the ampicillin
resistance gene of E. coli and S. cerevisiae TRP1 gene, and
a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor,
acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate
phase with translation initiation and termination sequences,
and preferably, a leader sequence capable of directing
secretion of translated enzyme. Optionally, the heterologous
sequence can encode a fusion enzyme including an N-terminal
identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed
recombinant product.

Useful expression vectors for bacterial use are 
constructed by inserting a structural DNA sequence encoding 
a desired protein together with suitable translation 
initiation and termination signals in operable reading phase 
with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of 
replication to ensure maintenance of the vector and to, if 
desirable, provide amplification within the host. Suitable
prokaryotic hosts for transformation include E. coli,
Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and
Staphylococcus, although others may also be employed as a
matter of choice.

As a representative but nonlimiting example, useful 
expression vectors for bacterial use can comprise a 
selectable marker and bacterial origin of replication derived 
from commercially available plasmids comprising genetic 
elements of the well known cloning vector pBR322 (ATCC 
37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1
(Promega Biotec, Madison, WI, USA). These pBR322 "backbone"
sections are combined with an appropriate promoter and the
structural sequence to be expressed.

Following transformation of a suitable host strain and
growth of the host strain to an appropriate cell density, the 
selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are 
cultured for an additional period. 

Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the resulting 
crude extract retained for further purification. 

Microbial cells employed in expression of proteins can 
be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell
lysing agents; such methods are well known to those skilled
in the art.

Various mammalian cell culture systems can also be 
employed to express recombinant protein. Examples of 
mammalian expression systems include the COS-7 lines of
monkey kidney fibroblasts, described by Gluzman, Cell, 23:175
(1981), and other cell lines capable of expressing a
compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. Mammalian expression vectors will comprise
an origin of replication, a suitable promoter and enhancer,
and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites,
transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the
SV40 splice and polyadenylation sites may be used to provide
the required nontranscribed genetic elements.

The enzyme can be recovered and purified from 
recombinant cell cultures by methods including ammonium 
sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography,
affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, 
as necessary, in completing configuration of the mature
protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally 
purified product, or a product of chemical synthetic 
procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, 
yeast, higher plant, insect and mammalian cells in culture).
Depending upon the host employed in a recombinant production 
procedure, the enzymes of the present invention may be 
glycosylated or may be non-glycosylated. Enzymes of the 
invention may or may not also include an initial methionine 
amino acid residue. 

β-galactosidase hydrolyzes lactose to galactose and
glucose. Accordingly, the OC1/4V, 9N2-31B/G, AEDII12RA-18B/G 
and F1-12G enzymes may be employed in the food processing 
industry for the production of low lactose content milk and 
for the production of galactose or glucose from lactose 
contained in whey obtained in a large amount as a by-product 
in the production of cheese. Generally, it is desired that 
enzymes used in food processing, such as the aforementioned 
β-galactosidases, be stable at elevated temperatures to help
prevent microbial contamination. 

These enzymes may also be employed in the pharmaceutical 
industry. The enzymes are used to treat intolerance to 
lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in
diagnostic applications, where they are employed as reporter
molecules.

Glucosidases act on soluble cellooligosaccharides from 
the non-reducing end to give glucose as the sole product.
Glucanases (endo- and exo-) act in the depolymerization of
cellulose, generating more non-reducing ends (endo-
glucanases, for instance, act on internal linkages yielding
cellobiose, glucose and cellooligosaccharides as products).
β-glucosidases are used in applications where glucose is the
desired product. Accordingly, M11TL, F1-12G, GC74-22G and
MSB8-6G (and OC1/4V, VC1-7G1, 9N2-31B/G and AEDII12RA-18B/G)
may be employed in a wide variety of industrial applications, 
including in corn wet milling for the separation of starch 
and gluten, in the fruit industry for clarification and 
equipment maintenance, in baking for viscosity reduction, in 
the textile industry for the processing of blue jeans, and in 
the detergent industry as an additive. For these and other 
applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding 
to a sequence of the present invention can be obtained by 
direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a 
nonhuman. The antibody so obtained will then bind the
enzymes themselves. In this manner, even a sequence encoding
only a fragment of the enzymes can be used to generate
antibodies binding the whole native enzymes. Such antibodies
can then be used to isolate the enzyme from cells expressing 
that enzyme. 

For preparation of monoclonal antibodies, any technique
which provides antibodies produced by continuous cell line 
cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), 
the trioma technique, the human B-cell hybridoma technique 
(Kozbor et al., 1983, Immunology Today 4:72), and the EBV- 
hybridoma technique to produce human monoclonal antibodies 
(Cole, et al., 1985, in Monoclonal Antibodies and Cancer 
Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain 
antibodies (U.S. Patent 4,946,778) can be adapted to produce 
single chain antibodies to immunogenic enzyme products of 
this invention. Also, transgenic mice may be used to express 
humanized antibodies to immunogenic enzyme products of this 
invention.

Antibodies generated against the enzyme of the present
invention may be used in screening for similar enzymes from
other organisms and samples. Such screening techniques are
known in the art; for example, one such screening assay is
described in "Methods for Measuring Cellulase Activities",
Methods in Enzymology, Vol 160, pp. 87-116, which is hereby
incorporated by reference in its entirety. 

The present invention will be further described with 
reference to the following examples; however, it is to be 
understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, 
are by weight.

In order to facilitate understanding of the following 
examples certain frequently occurring methods and/or terms 
will be described. 

"Plasmids" are designated by a lower case p preceded 
and/or followed by capital letters and/or numbers. The 
starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be 
constructed from available plasmids in accord with published 
procedures. In addition, equivalent plasmids to those 
described are known in the art and will be apparent to the 
ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the 
DNA with a restriction enzyme that acts only at certain 
sequences in the DNA. The various restriction enzymes used 
herein are commercially available and their reaction 
conditions, cofactors and other requirements were used as
would be known to the ordinarily skilled artisan. For
analytical purposes, typically 1 µg of plasmid or DNA
fragment is used with about 2 units of enzyme in about 20 µl
of buffer solution. For the purpose of isolating DNA
fragments for plasmid construction, typically 5 to 50 µg of
DNA are digested with 20 to 250 units of enzyme in a larger
volume. Appropriate buffers and substrate amounts for
particular restriction enzymes are specified by the
manufacturer. Incubation times of about 1 hour at 37°C are
ordinarily used, but may vary in accordance with the 
supplier's instructions. After digestion the reaction is 
electrophoresed directly on a polyacrylamide gel to isolate 
the desired fragment. 

Size separation of the cleaved fragments is performed 
using 8 percent polyacrylamide gel described by Goeddel, D. 
et al., Nucleic Acids Res., 8:4057 (1980). 

"Oligonucleotides" refers to either a single stranded 
polydeoxynucleotide or two complementary polydeoxynucleotide 
strands which may be chemically synthesized. Such synthetic 
oligonucleotides have no 5' phosphate and thus will not 
ligate to another oligonucleotide without adding a phosphate 
with an ATP in the presence of a kinase. A synthetic 
oligonucleotide will ligate to a fragment that has not been 
dephosphorylated.

"Ligation" refers to the process of forming 
phosphodiester bonds between two double stranded nucleic acid 
fragments (Maniatis, T., et al., Id., p. 146). Unless
otherwise provided, ligation may be accomplished using known
buffers and conditions with 10 units of T4 DNA ligase
("ligase") per 0.5 µg of approximately equimolar amounts of
the DNA fragments to be ligated. 
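
The ligation proportions can be illustrated with a short calculation. For double-stranded fragments, equimolar amounts mean that each fragment's mass is proportional to its length, so the sketch below splits a total DNA mass accordingly and scales the ligase at 10 units per 0.5 µg. The function name and the fragment lengths in the example are hypothetical.

```python
def ligation_plan(fragment_lengths_bp, total_dna_ug=0.5,
                  units_per_half_ug=10.0):
    """Return (mass of each fragment in µg, ligase units) for a ligation.

    Equal moles of each fragment implies mass proportional to length,
    because moles = mass / (length * average mass per base pair).
    """
    if not fragment_lengths_bp or min(fragment_lengths_bp) <= 0:
        raise ValueError("fragment lengths must be positive")
    total_len = sum(fragment_lengths_bp)
    masses = [total_dna_ug * length / total_len
              for length in fragment_lengths_bp]
    # Scale the 10 units per 0.5 µg figure linearly with total DNA mass
    ligase_units = units_per_half_ug * total_dna_ug / 0.5
    return masses, ligase_units


if __name__ == "__main__":
    # Hypothetical 3000 bp vector and 1000 bp insert
    masses, units = ligation_plan([3000, 1000])
    print(masses, units)
```

In practice an excess of insert over vector is often used; the equimolar split here follows the wording of the definition rather than any optimized protocol.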

Unless otherwise stated, transformation was performed as 
described in the method of Graham, F. and Van der Eb, A., 
Virology, 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes

DNA encoding the enzymes of the present invention, SEQ
ID NOS: 1 through 8, was initially amplified from a
pBluescript vector containing the DNA by the PCR technique
using the primers noted herein. The amplified sequences were
then inserted into the respective pQE vector listed beneath
the primer sequences, and the enzyme was expressed according
to the protocols set forth herein. The 5' and 3' primer
sequences for the respective genes are as follows: 

Thermococcus AEDII12RA - 18B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC
(SEQ ID NO: 29)
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA (SEQ ID NO: 30)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

OC1/4V - 33B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC
(SEQ ID NO: 31)
3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT (SEQ ID NO: 32)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

Thermococcus 9N2 - 31B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCT
(SEQ ID NO: 33)
3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC (SEQ ID NO: 34)

Vector: pQE30; and contains the following restriction enzyme
sites 5' EcoRI and 3' KpnI.

Staphylothermus marinus F1 - 12G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT
(SEQ ID NO: 35)
3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC (SEQ ID NO: 36)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

Thermococcus chitonophagus GC74 - 22G

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATC
(SEQ ID NO: 37)
3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC (SEQ ID NO: 38)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' BamHI.

M11TL

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG (SEQ ID NO: 39)
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT (SEQ ID NO: 40)

Vector: pQE70; and contains the following restriction enzyme
sites 5' SphI and 3' Hind III.

Thermotoga maritima MSB8 - 6G

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT
(SEQ ID NO: 41)
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC (SEQ ID NO: 42)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' KpnI.

Pyrococcus furiosus VC1 - 7G1

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT
(SEQ ID NO: 43)
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC (SEQ ID NO: 44)

Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Kpn I.

Bankia gouldi endoglucanase (37GP1)

5' AATAAGGATCCGTTTAGCGACGCTCGC (SEQ ID NO: 45)
3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC (SEQ ID NO: 46)

Vector: pQES2; and contains the following restriction enzyme
sites 5' Bam HI and 3' Hind III.

Thermotoga maritima α-galactosidase (6GC2)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTrGTGGAAATATTCGGAAAG
(SEQ ID NO: 47)
3' TCTATAAAGCTTTCATTCTCTCTCACCCTCTTCGTAGAAG (SEQ ID NO: 48)

Vector: pQEt; and contains the following restriction enzyme
sites 5' EcoRI and 3' Hind III.

Thermotoga maritima β-mannanase (6GP2)

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC
(SEQ ID NO: 49)
3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC (SEQ ID NO: 50)

Vector: pQEt; and contains the following restriction enzyme
sites 5' Hind III and 3' EcoRI.

AEPII 1a β-mannanase (63GB1)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC
(SEQ ID NO: 51)
3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC (SEQ ID NO: 52)

Vector: pQEt; and contains the following restriction enzyme
sites 5' Hind III and 3' EcoRI.

OC1/4V endoglucanase (33GP1)

5' AAAAAACAATTGAACTCATTAAAGAGGAGAAATTAACTA
(SEQ ID NO: 53)
3' TTTTTCGGATCCAATTCTTCATTTACTCTTTGCCTG (SEQ ID NO: 54)

Vector: pQEt; and contains the following restriction enzyme
sites 5' BamHI and 3' EcoRI.

Thermotoga maritima pullulanase (6GP3)

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC
(SEQ ID NO: 55)
3' ATAAGAAGCTTTTCACTCTCTGTACAGAACGTACGC (SEQ ID NO: 56)

Vector: pQEt; and contains the following restriction enzyme
sites 5' EcoRI and 3' Hind III.

The restriction enzyme sites indicated correspond to the
restriction enzyme sites on the bacterial expression vector
indicated for the respective gene (Qiagen, Inc. Chatsworth,
CA). The pQE vector encodes antibiotic resistance (Ampʳ), a
bacterial origin of replication (ori), an IPTG-regulatable
promoter operator (P/O), a ribosome binding site (RBS), a
6-His tag and restriction enzyme sites.
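
The stated cloning sites can be cross-checked against the primers themselves. The following sketch is not part of the original disclosure: it searches a few of the 3' primers listed above for the recognition sequence of the enzyme named for each vector. The recognition sequences are the standard ones for these enzymes; the function and variable names are illustrative only.

```python
# Standard recognition sequences for the enzymes named in the primer listing.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    seq = primer.replace(" ", "").upper()
    return seq.find(RECOGNITION_SITES[enzyme])


# (label, 3' primer as listed, enzyme stated for the 3' end)
PRIMERS = [
    ("SEQ ID NO: 30", "CGGAAGATCTTCATAGCTCCGGAAGCCCATA", "BglII"),
    ("SEQ ID NO: 34", "CGGAGGTACCTCACCCAAGTCCGAACTTCTC", "KpnI"),
    ("SEQ ID NO: 38", "CGGAGGATCCCTACCCCTCCTCTAAGATCTC", "BamHI"),
    ("SEQ ID NO: 40", "AATAAAAGCTTACTGGATCAGTGTAAGATGCT", "HindIII"),
]

for label, primer, enzyme in PRIMERS:
    print(label, enzyme, site_position(primer, enzyme))
```

In each of these primers the site begins four or five bases in from the 5' end, so a short run of extra bases precedes the recognition sequence.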

The pQE vector was digested with the restriction enzymes
indicated. The amplified sequences were ligated into the
respective pQE vector and inserted in frame with the sequence
encoding for the RBS. The ligation mixture was then used to
transform the E. coli strain M15/pREP4 (Qiagen, Inc.) by
electroporation. M15/pREP4 contains multiple copies of the
plasmid pREP4, which expresses the lacI repressor and also
confers kanamycin resistance (Kanʳ). Transformants were
identified by their ability to grow on LB plates and
ampicillin/kanamycin resistant colonies were selected.
Plasmid DNA was isolated and confirmed by restriction
analysis. Clones containing the desired constructs were
grown overnight (O/N) in liquid culture in LB media
supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml).
The O/N culture was used to inoculate a large culture at a
ratio of 1:100 to 1:250. The cells were grown to an optical
density 600 (O.D.₆₀₀) of between 0.4 and 0.6. IPTG
("Isopropyl-β-D-thiogalactopyranoside") was then added to a
final concentration of 1 mM. IPTG induces by inactivating
the lacI repressor, clearing the P/O leading to increased
gene expression. Cells were grown an extra 3 to 4 hours.
Cells were then harvested by centrifugation.
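
The inoculation ratio and the inducer addition reduce to simple dilution arithmetic. The sketch below is illustrative only: the 1 L culture volume and the 1 M IPTG stock are hypothetical, and the 1 mM final IPTG concentration is an assumed reading of the protocol, not a value confirmed elsewhere in the text.

```python
def inoculum_ml(culture_ml: float, dilution: int) -> float:
    """Overnight culture needed to seed culture_ml at a 1:dilution ratio."""
    return culture_ml / dilution


def stock_ml(culture_ml: float, final_mm: float, stock_mm: float) -> float:
    """Stock volume giving final_mm in culture_ml (C1*V1 = C2*V2, added volume ignored)."""
    return culture_ml * final_mm / stock_mm


CULTURE_ML = 1000.0     # hypothetical 1 L expression culture
IPTG_FINAL_MM = 1.0     # assumed final IPTG concentration (mM)
IPTG_STOCK_MM = 1000.0  # hypothetical 1 M IPTG stock

print("O/N culture at 1:100:", inoculum_ml(CULTURE_ML, 100), "ml")
print("O/N culture at 1:250:", inoculum_ml(CULTURE_ML, 250), "ml")
print("IPTG stock to add:", stock_ml(CULTURE_ML, IPTG_FINAL_MM, IPTG_STOCK_MM), "ml")
```

Under these assumptions a 1 L culture takes between 4 ml and 10 ml of overnight culture and 1 ml of the IPTG stock.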

The primer sequences set out above may also be employed 
to isolate the target gene from the deposited material by 
hybridization techniques described above. 
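
How a primer pair relates to its target gene can be sketched as follows. This example is not part of the original disclosure: it translates the coding portion of the 5' primer for the Staphylothermus marinus F1 - 12G gene (SEQ ID NO: 35, from its ATG onward) and the reverse complement of the gene-specific portion of the 3' primer (SEQ ID NO: 36, after the Bgl II site), using the standard genetic code. Function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order, one letter per codon ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of seq in frame 1 into one-letter residues."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Coding portion of the 5' primer (SEQ ID NO: 35), starting at the ATG.
FIVE_PRIME_CODING = "ATGATAAGGTTTCCTGATTAT"
# Gene-specific portion of the 3' primer (SEQ ID NO: 36), after the Bgl II site.
THREE_PRIME_TAIL = "TTATTCGAGGTTCTTTAATCC"

print(translate(FIVE_PRIME_CODING))                      # N-terminal residues
print(translate(reverse_complement(THREE_PRIME_TAIL)))   # C-terminal residues and stop
```

The two translations, Met Ile Arg Phe Pro Asp Tyr and Gly Leu Lys Asn Leu Glu End, agree with the first and last residues shown in the Staphylothermus marinus sequence figure.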

Example 2

Isolation of a Selected Clone From the Deposited Genomic
Clones

A clone is isolated directly by screening the
deposited material using the oligonucleotide primers set
forth in Example 1 for the particular gene desired to be
isolated. The specific oligonucleotides are synthesized
using an Applied Biosystems DNA synthesizer. The
oligonucleotides are labeled with ³²P-γ-ATP using T4
polynucleotide kinase and purified according to a standard
protocol (Maniatis et al., Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY,
1982). The deposited clones in the pBluescript vectors may
be employed to transform bacterial hosts which are then
plated on 1.5% agar plates to the density of 20,000-50,000
pfu/150 mm plate. These plates are screened using Nylon
membranes according to the standard screening protocol
(Stratagene, 1993). Specifically, the Nylon membrane with
denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM
NaH₂PO₄, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured,
sonicated salmon sperm DNA; and 6 x SSC, 0.1% SDS. After
one hour of prehybridization, the membrane is hybridized
with hybridization buffer 6 x SSC, 20 mM NaH₂PO₄, 0.4% SDS,
500 µg/ml denatured, sonicated salmon sperm DNA with 1 x 10⁶
cpm/ml ³²P-probe overnight at 42°C. The membrane is washed
at 45-50°C with washing buffer 6 x SSC, 0.1% SDS for 20-30
minutes, dried and exposed to Kodak X-ray film overnight.
Positive clones are isolated and purified by secondary and
tertiary screening. The purified clone is sequenced to
verify its identity to the primer sequence.

Once the clone is isolated, the two oligonucleotide
primers corresponding to the gene of interest are used to
amplify the gene from the deposited material. A polymerase
chain reaction is carried out in 25 µl of reaction mixture
with 0.5 µg of the DNA of the gene of interest. The
reaction mixture is 1.5-5 mM MgCl₂, 0.01% (w/v) gelatin, 20
µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer
and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR
(denaturation at 94°C for 1 min; annealing at 55°C for 1
min; elongation at 72°C for 1 min) are performed with the
Perkin-Elmer Cetus automated thermal cycler. The amplified
product is analyzed by agarose gel electrophoresis and the
DNA band with expected molecular weight is excised and
purified. The PCR product is verified to be the gene of
interest by subcloning and sequencing the DNA product. The
ends of the newly purified genes are nucleotide sequenced
to identify full length sequences. Complete sequencing of
full length genes is then performed by Exonuclease III
digestion or primer walking.
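
The cycling programme and primer amounts given above imply a total hold time and a working primer concentration, which the following sketch computes. It is an illustration rather than part of the original protocol; ramp times between temperatures are not stated in the text and are ignored, and the names are hypothetical.

```python
CYCLES = 35
# Hold time in minutes for each step of one cycle, as stated in the text.
STEP_MINUTES = {
    "denaturation, 94 C": 1,
    "annealing, 55 C": 1,
    "elongation, 72 C": 1,
}
REACTION_UL = 25    # reaction volume in microlitres
PRIMER_PMOL = 25    # amount of each primer in picomoles


def cycling_minutes(cycles: int, steps: dict) -> int:
    """Total hold time over all cycles, excluding ramping between steps."""
    return cycles * sum(steps.values())


def concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """pmol per microlitre is numerically equal to micromolar."""
    return amount_pmol / volume_ul


print("Total hold time:", cycling_minutes(CYCLES, STEP_MINUTES), "min")
print("Each primer:", concentration_um(PRIMER_PMOL, REACTION_UL), "uM")
```

The programme therefore holds for 105 minutes in total, and each primer is present at 1 µM in the 25 µl reaction.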

Example 3

Screening for Galactosidase Activity

α-Galactosidase protein activity may be screened for as
follows:

Substrate plates were provided by a standard plating
procedure. Dilute XL1-Blue MRF' E. coli host (Stratagene
Cloning Systems, La Jolla, CA) to O.D.₆₀₀ = 1.0 with NZY
media. In 15 ml tubes, inoculate 200 µl diluted host cells
with phage. Mix gently and incubate tubes at 37°C for 15
min. Add approximately 3.5 ml LB top agarose (0.7%)
containing 1 mM IPTG to each tube and pour onto the surface
of an NZY plate. Allow to cool and incubate at 37°C
overnight. The assay plates contain the substrate
p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM
NaCl, 100 mM potassium phosphate, 1% (w/v) agarose. The
plaques are overlaid with nitrocellulose and incubated at
4°C for 30 minutes, whereupon the nitrocellulose is removed
and overlaid onto the substrate plates. The substrate
plates are then incubated at 70°C for 20 minutes.

Example 4

Screening of Clones for Mannanase Activity

A solid phase screening assay was utilized as a
primary screening method to test clones for β-mannanase
activity.

A culture solution of the Y1090-E. coli host strain
(Stratagene Cloning Systems, La Jolla, CA) was diluted to
O.D.₆₀₀ = 1.0 with NZY media. The amplified Thermotoga
maritima lambda gt11 library was diluted in SM
(phage dilution buffer): 5 x 10⁷ pfu/µl diluted 1:1000
then 1:100 to 5 x 10² pfu/µl. Then 8 µl of phage dilution
(5 x 10² pfu/µl) was plated in 200 µl host cells. They
were then incubated in 15 ml tubes at 37°C for 15 minutes.
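
The two-step dilution and the resulting plating density can be verified with a few lines of arithmetic. The sketch below is illustrative and uses only the figures stated in the paragraph above; the function name is hypothetical.

```python
def serial_dilution(stock_titer: int, factors) -> int:
    """Apply successive 1:factor dilutions to a titer given in pfu per microlitre."""
    titer = stock_titer
    for factor in factors:
        titer //= factor
    return titer


STOCK_PFU_PER_UL = 5 * 10**7   # amplified library titer
DILUTIONS = (1000, 100)        # 1:1000 followed by 1:100
PLATED_UL = 8                  # volume of diluted phage added to host cells

working_titer = serial_dilution(STOCK_PFU_PER_UL, DILUTIONS)
pfu_per_plate = PLATED_UL * working_titer

print("Working titer:", working_titer, "pfu/ul")
print("Phage per plate:", pfu_per_plate, "pfu")
```

The working titer comes out at 5 x 10² pfu/µl, so the 8 µl aliquot delivers about 4,000 plaque forming units to each plate.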

Approximately 4 ml of molten LB top agarose (0.7%) at
approximately 52°C was added to each tube and the mixture
was poured onto the surface of LB agar plates. The agar
plates were then incubated at 37°C for five hours. The
plates were replicated and induced with 10 mM IPTG-soaked
Duralon-UV™ nylon membranes (Stratagene Cloning Systems,
La Jolla, CA) overnight. The nylon membranes and plates
were marked with a needle to keep their orientation and the
nylon membranes were then removed and stored at 4°C.

An Azo-galactomannan overlay was applied to the LB
plates containing the lambda plaques. The overlay contains
1% agarose, 50 mM potassium-phosphate buffer pH 7, 0.4%
Azocarob-galactomannan (Megazyme, Australia). The plates
were incubated at 72°C. The Azocarob-galactomannan
treated plates were observed after 4 hours then returned to
incubation overnight. Putative positives were identified
by clearing zones on the Azocarob-galactomannan plates.
Two positive clones were observed.

The nylon membranes referred to above, which
correspond to the positive clones, were retrieved, oriented
over the plate and the portions matching the locations of
the clearing zones for positive clones were cut out. Phage
was eluted from the membrane cut-out portions by soaking
the individual portions in 500 µl SM (phage dilution
buffer) and 25 µl CHCl₃.

Example 5

Screening of Clones for Mannosidase Activity

A solid phase screening assay was utilized as a
primary screening method to test clones for β-mannosidase
activity.

A culture solution of the Y1090-E. coli host strain
(Stratagene Cloning Systems, La Jolla, CA) was diluted to
O.D.₆₀₀ = 1.0 with NZY media. The amplified AEPII 1a
lambda gt11 library was diluted in SM (phage
dilution buffer): 5 x 10⁷ pfu/µl diluted 1:1000 then 1:100
to 5 x 10² pfu/µl. Then 8 µl of phage dilution
(5 x 10² pfu/µl) was plated in 200 µl host cells. They
were then incubated in 15 ml tubes at 37°C for 15 minutes.

Approximately 4 ml of molten LB top agarose (0.7%) at
approximately 52°C was added to each tube and the mixture
was poured onto the surface of LB agar plates. The agar
plates were then incubated at 37°C for five hours. The
plates were replicated and induced with 10 mM IPTG-soaked
Duralon-UV™ nylon membranes (Stratagene Cloning Systems,
La Jolla, CA) overnight. The nylon membranes and plates
were marked with a needle to keep their orientation and the
nylon membranes were then removed and stored at 4°C.

A p-nitrophenyl-β-D-manno-pyranoside overlay was
applied to the LB plates containing the lambda plaques.
The overlay contains 1% agarose, 50 mM potassium-phosphate
buffer pH 7, 0.4% p-nitrophenyl-β-D-manno-pyranoside
(Megazyme, Australia). The plates were incubated at 72°C.
The p-nitrophenyl-β-D-manno-pyranoside treated plates were
observed after 4 hours then returned to incubation
overnight. Putative positives were identified by clearing
zones on the p-nitrophenyl-β-D-manno-pyranoside plates.
Two positive clones were observed.

The nylon membranes referred to above, which
correspond to the positive clones, were retrieved, oriented
over the plate and the portions matching the locations of
the clearing zones for positive clones were cut out. Phage
was eluted from the membrane cut-out portions by soaking
the individual portions in 500 µl SM (phage dilution
buffer) and 25 µl CHCl₃.

Example 6

Screening for Pullulanase Activity

Pullulanase protein activity may be screened for as
follows:

Substrate plates were provided by a standard plating
procedure. Host cells are diluted to O.D.₆₀₀ = 1.0 with NZY
or appropriate media. In 15 ml tubes, inoculate 200 µl
diluted host cells with phage. Mix gently and incubate
tubes at 37°C for 15 min. Approximately 3.5 ml LB top
agarose (0.7%) is added to each tube and the mixture is
plated, allowed to cool, and incubated at 37°C for about 28
hours. Overlays of 4.5 ml of the following substrate are
poured:

100 ml total volume
0.5 g Red Pullulan (Megazyme, Australia)
1.0 g Agarose
5 ml Buffer (Tris-HCl pH 7.2 @ 75°C)
2 ml 5 M NaCl
5 ml CaCl₂ (100 mM)
85 ml dH₂O

Plates are cooled at room temperature, and then incubated
at 75°C for 2 hours. Positives are observed as showing
substrate degradation.
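
The final concentrations in the overlay follow directly from the recipe. The sketch below is illustrative only and assumes the batch is made up to a total volume of 100 ml; the helper names are hypothetical.

```python
TOTAL_ML = 100.0    # assumed total batch volume
OVERLAY_ML = 4.5    # overlay volume poured per plate


def percent_w_v(grams: float, total_ml: float) -> float:
    """Weight/volume percentage: grams per 100 ml."""
    return 100.0 * grams / total_ml


def final_mm(stock_mm: float, added_ml: float, total_ml: float) -> float:
    """Final millimolar concentration after diluting added_ml of stock into total_ml."""
    return stock_mm * added_ml / total_ml


print("Red Pullulan:", percent_w_v(0.5, TOTAL_ML), "% (w/v)")
print("Agarose:", percent_w_v(1.0, TOTAL_ML), "% (w/v)")
print("NaCl:", final_mm(5000.0, 2.0, TOTAL_ML), "mM")
print("CaCl2:", final_mm(100.0, 5.0, TOTAL_ML), "mM")
print("Overlays per batch:", int(TOTAL_ML // OVERLAY_ML))
```

On that assumption the overlay is 0.5% Red Pullulan and 1% agarose in 100 mM NaCl and 5 mM CaCl₂, and one batch covers 22 plates.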

Example 7

Screening for Endoglucanase Activity

Endoglucanase protein activity may be screened for as
follows:

1. The gene library is plated onto 6 LB/GelRite/0.1%
CMC/NZY agar plates (~4,800 plaque forming units/plate) in
E. coli host with LB agarose as top agarose. The plates are
incubated at 37°C overnight.

2. Plates are chilled at 4°C for one hour.

3. The plates are overlaid with Duralon membranes
(Stratagene) at room temperature for one hour and the
membranes are oriented and lifted off the plates and stored
at 4°C.

4. The top agarose layer is removed and plates are
incubated at 37°C for ~3 hours.

5. The plate surface is rinsed with NaCl.

6. The plate is stained with 0.1% Congo Red for 15
minutes.

7. The plate is destained with 1 M NaCl.

8. The putative positives identified on plate are
isolated from the Duralon membrane (positives are
identified by clearing zones around clones). The phage is
eluted from the membrane by incubating in 500 µl SM + 25 µl
CHCl₃.

9. Insert DNA is subcloned into any appropriate
cloning vector and subclones are reassayed for CMCase
activity using the following protocol:

i) Spin 1 ml overnight miniprep of clone at
maximum speed for 3 minutes.

ii) Decant the supernatant and use it to fill
"wells" that have been made in an LB/GelRite/0.1% CMC
plate.

iii) Incubate at 37°C for 2 hours.

iv) Stain with 0.1% Congo Red for 15 minutes.

v) Destain with 1 M NaCl for 15 minutes.

vi) Identify positives by clearing zone around
clone.

Numerous modifications and variations of the present
invention are possible in light of the above teachings and,
therefore, within the scope of the appended claims, the
invention may be practiced otherwise than as particularly
described.

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a member
selected from the group consisting of:

(a) a polynucleotide having at least a 70%
identity to a polynucleotide encoding an enzyme comprising
amino acid sequences set forth in SEQ ID NOS: 15-28;

(b) a polynucleotide which is complementary to
the polynucleotide of (a); and

(c) a polynucleotide comprising at least 15
bases of the polynucleotide of (a) or (b).

2. The polynucleotide of Claim 1 wherein the
polynucleotide is DNA.

3. The polynucleotide of Claim 1 wherein the
polynucleotide is RNA.

4. The polynucleotide of Claim 2 which encodes an
enzyme comprising an amino acid sequence which is a member
selected from the group

5. An isolated polynucleotide comprising a member
selected from the group consisting of:

(a) a polynucleotide having at least a 70%
identity to a polynucleotide encoding an enzyme encoded by
the DNA contained in ATCC Deposit No. 97379, wherein said
enzyme is selected from the group consisting of M11TL,
OC1/4V, F1-12G, 9N2-31B/G, MSB8-6G, AEDII12RA-18B/G,
GC74-22G and VC1-7G1;

(b) a polynucleotide complementary to the
polynucleotide of (a); and

(c) a polynucleotide comprising at least 15
bases of the polynucleotide of (a) and (b).

6. A vector comprising the DNA of Claim 2.

7. A host cell comprising the vector of Claim 6.

8. A process for producing a polypeptide comprising:
expressing from the host cell of Claim 7 a polypeptide
encoded by said DNA.

9. A process for producing a cell comprising:
transforming or transfecting the cell with the vector of
Claim 6 such that the cell expresses the polypeptide
encoded by the DNA contained in the vector.

10. An enzyme comprising a member selected from the
group consisting of:

(a) an enzyme comprising an amino acid sequence
which is at least 70% identical to the amino acid sequence
set forth in SEQ ID NOS: 15-28; and

(b) an enzyme which comprises at least 30 amino
acid residues to the enzyme of (a).

11. A method for generating glucose from soluble
cellooligosaccharides comprising:
administering an effective amount of an enzyme
selected from the group consisting of an enzyme having the
amino acid sequence set forth in SEQ ID NOS: 15-28.

M11TL GLYCOSIDASE - 29G
COMPLETE GENE SEQUENCE - 9/95

rri: aaa m . i r AAA i;ac rr( at« . ata « :» .< ta* m a t» *t tc a ■-■-«. pit t AA vr \ i ,aa . .. ■: • 

H.-l l.v:. I'l... I'i.. t.ys AS|. I'll.' M.-i !)•■ 1y i . , (•?.. I'll.- liln I'll.- ..t,* ,\l . ■ 'n 

i.i cct ATT err ccf: T'v- t .A<; cat t -i i . aat act uAT tcc ret; cta TCC CTC cat cat ccc t ;Ac I .'u 

! i;ly Mr I'm C]y :U'f Clu A.:p I'i. • As;n <, t Asp Tip Trp V* I Trp V<* 1 Hi;; Asp I 411 

i. i aac Af.A i;ca cr-r or, a fTA ctc aci ccc cat ttt ccc cac aac ccc cca ci;t tac tcc aai ihh 

II Ar.n Tin A I .1 Al,i Cly L«*o V.n I Ni-r Cly A.-:p I'tM- Pin C.lu ASH (ily Pio Cly Ty 1 Tip A-;n i.U 

18! TTA AAC CAA AAT GAC CAC CAC CTC C(T OAC AA(i CTG CCG GTT AAC ACT ATT AHA OTA CGC AAQ 

<>l Leu Asn Gin Asn Asp His Asp um Al* Clu Lys Leu Cly Val Asn Thi lie Arq V*i 1 Cly 80 

241 CTT GAG TCC ACT ACG ATT TTT CCA AAG CCA ACT TTC AAT GTT AAA CTC CCT CTA GAG AGA 300
81 Val Glu Trp Ser Arg Ile Phe Pro Lys Pro Thr Phe Asn Val Lys Val Pro Val Glu Arg 100

301 CAT GAG AAC CGC AGC ATT GTT CAC CTA GAT GTC CAT GAT AAA GCG CTT GAA AGA CTT CAT 360
101 Asp Glu Asn Gly Ser Ile Val His Val Asp Val Asp Asp Lys Ala Val Glu Arg Leu Asp 120

361 GAA TTA CCC AAC AAG GAG GCC CTA AAC CAT TAC GTA GAA ATG TAT AAA GAC TGG GTT GAA 420
121 Glu Leu Ala Asn Lys Glu Ala Val Asn His Tyr Val Glu Met Tyr Lys Asp Trp Val Glu 140

421 AGA CCT AGA AAA CTT ATA CTC AAT TTA TAC CAT TGG CCC CTC CCT CTC TGG CTT CAC AAC 480
141 Arg Gly Arg Lys Leu Ile Leu Asn Leu Tyr His Trp Pro Leu Pro Leu Trp Leu His Asn 160

481 CCA ATC ATG CTG AGA AGA ATG CGC CCG CAC AGA CCC CCC TCA GCC TGG CTT AAC GAG GAG 540
161 Pro Ile Met Val Arg Arg Met Gly Pro Asp Arg Ala Pro Ser Gly Trp Leu Asn Glu Glu 180

541 TCC CTG CTG GAG TTT GCC AAA TAC CCC CCA TAC ATT CCT TGG AAA ATG GGC GAG CTA CCT 600
181 Ser Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala Trp Lys Met Gly Glu Leu Pro 200

601 GTT ATG TGG AGC ACC ATG AAC GAA CCC AAC CTC CTT TAT GAG CAA GGA TAC ATG TTC GTT 660
201 Val Met Trp Ser Thr Met Asn Glu Pro Asn Val Val Tyr Glu Gln Gly Tyr Met Phe Val 220

661 AAA GGG CCT TTC CCA CCC GGC TAC TTC ACT TTC GAA CCT CCT GAT AAG CCC AGG AGA AAT 720
221 Lys Gly Gly Phe Pro Pro Gly Tyr Leu Ser Leu Glu Ala Ala Asp Lys Ala Arg Arg Asn 240

721 ATG ATC CAG CCT CAT GCA CCG CCC TAT GAC AAT ATT AAA CGC TTC ACT AAG AAA CCT CTT 780
241 Met Ile Gln Ala His Ala Arg Ala Tyr Asp Asn Ile Lys Arg Phe Ser Lys Lys Pro Val 260

781 GGA CTA ATA TAC CCT TTC CAA TGG TTC GAA CTA TTA GAG GGT CCA GCA GAA GTA TTT CAT 840
261 Gly Leu Ile Tyr Ala Phe Gln Trp Phe Glu Leu Leu Glu Gly Pro Ala Glu Val Phe Asp 280

841 AAC TTT AAG AGC TCT AAC TTA TAC TAT TTC ACA GAC ATA GTA TCG AAG GGT AGT TCA ATC 900
281 Lys Phe Lys Ser Ser Lys Leu Tyr Tyr Phe Thr Asp Ile Val Ser Lys Gly Ser Ser Ile 300

901 ATC AAT GTT GAA TAC ACG ACA GAT CTT CCC AAT AGG CTA CAC TCG TTC GGC GTT AAC TAC 960
301 Ile Asn Val Glu Tyr Arg Arg Asp Leu Ala Asn Arg Leu Asp Trp Leu Gly Val Asn Tyr 320

961 TAT AGC CCT TTA GTC TAC AAA ATC CTC CAT GAC AAA CCT ATA ATC CTC CAC CCC TAT GGA 1020
321 Tyr Ser Arg Leu Val Tyr Lys Ile Val Asp Asp Lys Pro Ile Ile Leu His Gly Tyr Gly 340

1021 TTC CTT TCT ACA CCT GCG GGG ATC ACC CCG GCT CAA AAT CCT TCT ACC GAT TTT CGC TCC 1080
341 Phe Leu Cys Thr Pro Gly Gly Ile Ser Pro Ala Glu Asn Pro Cys Ser Asp Phe Gly Trp 360

1081 GAC GTC TAT CCT CAA CCA CTC TAC CTA CTT CTA AAA GAA CTT TAC AAC CGA TAC CCG CTA 1140
361 Glu Val Tyr Pro Glu Gly Leu Tyr Leu Leu Leu Lys Glu Leu Tyr Asn Arg Tyr Gly Val 380

1141 GAC TTC ATC CTC ACC CAC AAC GGT CTT TCA CAC ACC ACC GAT CCC TTC AGA CCC GCA TAC 1200
381 Asp Leu Ile Val Thr Glu Asn Gly Val Ser Asp Ser Arg Asp Ala Leu Arg Pro Ala Tyr 400

1201 CTC GTC JVC CAT CTT TAC AGC CTA TCC, AAA CCC CCT AAI* (JAG GCC ATT CCC CTC AAA GCC I <ibU 

401 l.tMi V«i i 'Ui Mis V.i I Tyi Ser V^ I Tip Lys A 1 a Ala A-.n clu cly Ml* Pro Vol Lys Gly 1 !'n 

I .fO I TAC CTC i A* TCC Al :c TTC ACA CAt AAT TAC CAC T« ;( . c ;CC ( Ac CCC TTC A< a; CAC AAA TTi I C" 

n*] Ty» t.«-u Mir: Tip Sti |.cu Thr Asp A f :n Tyr Clu Trp Al.. Clu Cly I'h. Artf C)t» j.y ; . |*Im .Mi'

OC1/4 GLYCOSIDASE - 33G/B
COMPLETE GENE SEQUENCE - 9/95



I ATC ATA ACA ACC TCC* CAT TTT ' *' * A AAA CAT TTT ATC TTC GCA A«C t :t -f A( < ; i;CA t K 'A TAC 

l Mf ( lie Arg Arg Ser Asp rh#* Pm i.v?5 A:;p Phe lie Plie Gly Thr Al.i Thr aIh A)^ Tyr 

61 CAG ATT GAA CGT CCA CCA AAC CAA CAT CCC ACA GCG CCA TCA ATT TCC CAT CTC TTT TCA
21 Gln Ile Glu Gly Ala Ala Asn Glu Asp Gly Arg Gly Pro Ser Ile Trp Asp Val Phe Ser

121 TAC ACG CCT CGC AAA ACC CTC AAC CGT CAC ACA GGA GAC GTT GCG TCT CAC CAT TAT CAC
41 Thr Pro Gly Lys Thr Leu Asn Gly Asp Thr Gly Asp Val Ala Cys Asp His Tyr His

181 CGA TAC AAG GAA GAT ATC CAC CTC ATC AAA GAA ATA GGG TTA GAC CCT TAC ACC TTC TCT
61 Arg Tyr Lys Glu Asp Ile Gln Leu Met Lys Glu Ile Gly Leu Asp Ala Tyr Arg Phe Ser

241 ATC TCC TCC CCC AGA ATT ATG CCA CAT GCG AAG AAC ATC AAC CAA AAG CGT CTC GAT TTC
81 Ile Ser Trp Pro Arg Ile Met Pro Asp Gly Lys Asn Ile Asn Gln Lys Gly Val Asp Phe

301 TAC AAC AGA CTC GTT GAT GAG CTT TTC AAG AAT GAT ATC ATA CCA TTC CTA ACA CTC TAT
101 Tyr Asn Arg Leu Val Asp Glu Leu Leu Lys Asn Asp Ile Ile Pro Phe Val Thr Leu Tyr

361 CAC TGG GAC TTA CCC TAC GCA CTT TAT GAA AAA GCT GGA TGG CTT AAC CCA GAT ATA GCG
121 His Trp Asp Leu Pro Tyr Ala Leu Tyr Glu Lys Gly Gly Trp Leu Asn Pro Asp Ile Ala

421 CTC TAT TTC AGA GCA TAC GCA ACC TTT ATC TTC AAC GAA CTC GCT GAT CGT CTC AAA CAT
141 Leu Tyr Phe Arg Ala Tyr Ala Thr Phe Met Phe Asn Glu Leu Gly Asp Arg Val Lys His

481 TGG ATT ACA CTC AAC GAA CCA TCC TCT TCT TCT TTC TCC GCT TAT TAC ACG GGA GAG CAT
161 Trp Ile Thr Leu Asn Glu Pro Trp Cys Ser Ser Phe Ser Gly Tyr Tyr Thr Gly Glu His

541 CCC CCC GCT CAT CAA AAT TTA CAA GAA GCG ATA ATC GCG GCG CAC AAC CTC TTC AGG GAA
181 Ala Pro Gly His Gln Asn Leu Gln Glu Ala Ile Ile Ala Ala His Asn Leu Leu Arg Glu

601 CAT GCA CAT CCC CTC CAC CCC TCC AGA CAA GAA CTA AAA GAT GGG GAA CTT CCC TTA ACC
201 His Gly His Ala Val Gln Ala Ser Arg Glu Glu Val Lys Asp Gly Glu Val Gly Leu Thr

661 AAC GTT CTC ATG AAA ATA GAA CCG CGC GAT CCA AAA CCC GAA ACT TTC TTC CTC GCA ACT
221 Asn Val Val Met Lys Ile Glu Pro Gly Asp Ala Lys Pro Glu Ser Phe Leu Val Ala Ser

721 CTT CTT GAT AAC TTC CTT AAT GCA TGG TCC CAT GAC CCT CTT GTT TTC GGA AAA TAT CCC
241 Leu Val Asp Lys Phe Val Asn Ala Trp Ser His Asp Pro Val Val Phe Gly Lys Tyr Pro

781 CAA CAA GCA CTT GCA CTT TAT ACC GAA AAA GCG TTC CAA CTT CTC CAT ACC CAT ATG AAT
261 Glu Glu Ala Val Ala Leu Tyr Thr Glu Lys Gly Leu Gln Val Leu Asp Ser Asp Met Asn

841 ATT ATT TCC ACT CCT ATA GAC TTC TTT CGT CTC AAT TAT TAC ACA AGA ACA CTT CTT CTT
281 Ile Ile Ser Thr Pro Ile Asp Phe Phe Gly Val Asn Tyr Tyr Thr Arg Thr Leu Val Val

901 TTT GAT ATC AAC AAT CCT CTT GGA TTT TCC TAT CTT CAG GGA CAC CTT CCC AAA ACG GAG
301 Phe Asp Met Asn Asn Pro Leu Gly Phe Ser Tyr Val Gln Gly Asp Leu Pro Lys Thr Glu

961 ATG GGA TCC CAA ATC TAC CCG CAG CCA TTA TTT GAT ATG CTC GTC TAT CTG AAC GAA AGA
321 Met Gly Trp Glu Ile Tyr Pro Gln Gly Leu Phe Asp Met Leu Val Tyr Leu Lys Glu Arg

1021 TAT AAA CTA CCA CTT TAT ATC ACA GAG AAC CCC ATG CCT GCA CCT GAT AAA TTC GAA AAC
341 Tyr Lys Leu Pro Leu Tyr Ile Thr Glu Asn Gly Met Ala Gly Pro Asp Lys Leu Glu Asn

1081 CCA AGA CTT CAT CAT AAT TAC CCA ATT CAA TAT TTC CAA AAC CAC TTT CAA AAA CCA CTT
361 Gly Arg Val His Asp Asn Tyr Arg Ile Glu Tyr Leu Glu Lys His Phe Glu Lys Ala Leu

1141 GAA CCA ATC AAT CCA GAT CTT GAT TTC AAA CCT TAC TTC ATT TCC TCT TTC ATC GAT AAC
381 Glu Ala Ile Asn Ala Asp Val Asp Leu Lys Gly Tyr Phe Ile Trp Ser Leu Met Asp Asn

1201 TTC GAA TCC CCC TCC GGA TAC TCC AAA CCT TTC CCT ATA ATC TAC CTA CAT TAC AAT ACC
401 Phe Glu Trp Ala Cys Gly Tyr Ser Lys Arg Phe Gly Ile Ile Tyr Val Asp Tyr Asn Thr

1261 CCA AAA ACC ATA TTT AAA CAT TCA CCC ATC TCC TTC AAC GAA TTT CTA AAA TCT TAA 1317
421 Pro Lys Arg Ile Leu Lys Asp Ser Ala Met Trp Leu Lys Glu Phe Leu Lys Ser End 439

STAPHYLOTHERMUS MARINUS GLYCOSIDASE - 12G
COMPLETE GENE SEQUENCE - 9/95



1 TTC ATA ACC TTT CCT CAT TAT TTC TTl TTT CCA A< A t;< ~r AC A TrA TCCI fA( ( AC AT( t :a< . 

I Met Me Arq Pne Pro Asp Tyr Phe Leu Phe c.\y Thr aIa Thr set Ser His On l l * * <;tw 

61 GCT AAT AAC ATA TTT AAT CAT TCC TCG GAG TCG GAG ACT AAA CGC AGG ATT AAG GTG AGA
21 Gly Asn Asn Ile Phe Asn Asp Trp Trp Glu Trp Glu Thr Lys Gly Arg Ile Lys Val Arg

121 TCG GCT AAG CCA TGT AAT CAT TGG GAA CTC TAT AAA GAA CAC ATA GAG CTT ATC CCT GAG
41 Ser Gly Lys Ala Cys Asn His Trp Glu Leu Tyr Lys Glu Asp Ile Glu Leu Met Ala Glu

181 CTC CGA TAT AAT GCT TAT AGG TTC TCC ATA GAG TGG ACT AGA ATA TTT CCC AGA AAA CAT
61 Leu Gly Tyr Asn Ala Tyr Arg Phe Ser Ile Glu Trp Ser Arg Ile Phe Pro Arg Lys Asp

241 CAT ATA GAT TAT GAG TCC CTT AAT AAG TAT AAG GAA ATA CTT AAT CTA CTT AGA AAA TAC
81 His Ile Asp Tyr Glu Ser Leu Asn Lys Tyr Lys Glu Ile Val Asn Leu Leu Arg Lys Tyr

301 GGG ATA GAA CCT CTA ATC ACT CTT CAC CAC TTC ACA AAC CCC CAA TGG TTT ATC AAA ATT
101 Gly Ile Glu Pro Val Ile Thr Leu His His Phe Thr Asn Pro Gln Trp Phe Met Lys Ile

361 GCT GGA TCG ACT AGG GAA GAG AAC ATA AAA TAT TTT ATA AAA TAT CTA GAA CTT ATA CCT
121 Gly Gly Trp Thr Arg Glu Glu Asn Ile Lys Tyr Phe Ile Lys Tyr Val Glu Leu Ile Ala

421 TCC GAG ATA AAA GAC GTG AAA ATA TGG ATC ACT ATT AAT GAA CCA ATA ATA TAT CTT TTA
141 Ser Glu Ile Lys Asp Val Lys Ile Trp Ile Thr Ile Asn Glu Pro Ile Ile Tyr Val Leu

481 CAA GGA TAT ATT TCC GGC GAA TGG CCA CCT GGA ATT AAA AAT TTA AAA ATA CCT GAT CAA
161 Gln Gly Tyr Ile Ser Gly Glu Trp Pro Pro Gly Ile Lys Asn Leu Lys Ile Ala Asp Gln

541 CTA ACT AAC AAT CTT TTA AAA GCA CAT AAT GAA GCC TAT AAT ATA CTT CAT AAA CAC GCT
181 Val Thr Lys Asn Leu Leu Lys Ala His Asn Glu Ala Tyr Asn Ile Leu His Lys His Gly

601 ATT CTA GGC ATA GCT AAA AAC ATC ATA GCA TTT AAA CCA GGA TCT AAT AGA GGA AAA GAC
201 Ile Val Gly Ile Ala Lys Asn Met Ile Ala Phe Lys Pro Gly Ser Asn Arg Gly Lys Asp

661 ATT AAT ATT TAT CAT AAA CTC GAT AAA GCA TTC AAC TGG GGA TTT CTC AAC GGA ATA TTA
221 Ile Asn Ile Tyr His Lys Val Asp Lys Ala Phe Asn Trp Gly Phe Leu Asn Gly Ile Leu

721 AGG GGA GAA CTA GAA ACT CTC CCT GGA AAA TAC CGA CTT GAG CCC GGA AAT ATT GAT TTC
241 Arg Gly Glu Leu Glu Thr Leu Arg Gly Lys Tyr Arg Val Glu Pro Gly Asn Ile Asp Phe

781 ATA GGC ATA AAC TAT TAT TCA TCA TAT ATT CTA AAA TAT ACT TGG AAT CCT TTT AAA CTA
261 Ile Gly Ile Asn Tyr Tyr Ser Ser Tyr Ile Val Lys Tyr Thr Trp Asn Pro Phe Lys Leu

841 CAT ATT AAA CTC GAA CCA TTA GAT ACA GCT CTA TGG ACA ACT ATG GGT TAC TCC ATA TAT
281 His Ile Lys Val Glu Pro Leu Asp Thr Gly Leu Trp Thr Thr Met Gly Tyr Cys Ile Tyr

901 CCT AGA GGA ATA TAT GAA GTT CTA ATG AAA ACT CAT GAG AAA TAC GCC AAA CAA ATA ATC
301 Pro Arg Gly Ile Tyr Glu Val Val Met Lys Thr His Glu Lys Tyr Gly Lys Glu Ile Ile

961 ATT ACA GAG AAC GCT GTT GCA CTA GAA AAT GAT GAA TTA AGG ATT TTA TCC ATT ATC AGG
321 Ile Thr Glu Asn Gly Val Ala Val Glu Asn Asp Glu Leu Arg Ile Leu Ser Ile Ile Arg

1021 CAC TTA CAA TAC TTA TAT AAA CCC ATC AAT CAA CCA CCA AAG GTG AAA GGA TAT TTC TAC
341 His Leu Gln Tyr Leu Tyr Lys Ala Met Asn Glu Gly Ala Lys Val Lys Gly Tyr Phe Tyr

1081 TCG ACC TTC ATC GAT AAT TTT GAG TCC CAT AAA CGA TTT AAC CAA AGG TTC GCA CTA CTA
361 Trp Ser Phe Met Asp Asn Phe Glu Trp Asp Lys Gly Phe Asn Gln Arg Phe Gly Leu Val

1141 GAA CTT CAT TAT AAC ACT TTT GAG AGA AAA CCT AGA AAA ACC CCA TAT CTA TAT ACT CAA
381 Glu Val Asp Tyr Lys Thr Phe Glu Arg Lys Pro Arg Lys Ser Ala Tyr Val Tyr Ser Gln

1201 ATA GCA CCT ACC AAG ACT ATA ACT GAT GAA TAC CTA GAA AAA TAT GGA TTA AAC AAC CTC
401 Ile Ala Arg Thr Lys Thr Ile Ser Asp Glu Tyr Leu Glu Lys Tyr Gly Leu Lys Asn Leu

1261 GAA TAA 1266
421 Glu End 422

Thermococcus 9N2 Glycosidase - 31B/G
Complete gene sequence 9/95

1021 TAC ACC ACA GAA CTC GTC ACG TAT TCC CAC CCC AAC TTC CCC AGC ATA CCC CTG ATA TCC 1080
341 Tyr Thr Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser 360

1081 TTC CGG GCA CTT CAC AAC TAC GGC TAC GCC TGC AGC CCC CCC ACT TCT TCC GCC CAC CCA 1140
361 Phe Arg Gly Val His Asn Tyr Gly Tyr Ala Cys Arg Pro Gly Ser Ser Ser Ala Asp Gly 380

1141 ACC CCC GTA ACC GAC ATC CCC TGG GAG ATC TAT CCC CAC CCC ATC TAC GAC TCG ATA ACA 1200
381 Arg Pro Val Ser Asp Ile Gly Trp Glu Ile Tyr Pro Glu Gly Ile Tyr Asp Ser Ile Arg 400

1201 GAG GCC AAC AAA TAC GCC CTC CCC CTT TAC GTC ACC GAA AAC CCA ATA CCC GAT TCX ACT 1260
401 Glu Ala Asn Lys Tyr Gly Val Pro Val Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser Thr 420

1261 CAC ACC CTC CGG CCC TAC TAC CTC CCG ACC CAT CTA CCC AAG ATT CAC CAG GCC TAC GAG 1320
421 Asp Thr Leu Arg Pro Tyr Tyr Leu Ala Ser His Val Ala Lys Ile Glu Glu Ala Tyr Glu 440
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U^l GCG CCT TXC GAC CTC ACC C"C ?AC CT< TAC TCG CCC CTH ACTC GAC AAC TAC GAG TOG GCC 1J80 

44 1 Ala Giy Tyr Asp v*l Aro GIv Tyr L»u T\rr Trp Ala L»u Thr A#o A*n Tyr Glu Trp Ala 4 60 

.J8i CTC CGT TTC AGC ATG AGC TTC GGC CTC TAT AAA CTC OAT CTC ATA ACC AAG GAG AO* ACA 14 40 

461 L»u Gly Arg ««c Axq ?n« Gly Um Tyc Ly» Val Asp L«u lis Thr Lyj GUu Arg Thr 480 

L441 CCC CCC GAG GAA ACC GTA AAG GTT TAT ACC CCC ATC CTC GAG AAC AAC GGA GTC AOC AAC 1500 

461 Pro Ax 9 Clu Glu Ser v*l L.fm val Tyr Arg Gi-y I*u Val Glu Abo Kmn Gly Val Ser Ly* 500 

L50* GAA ATC CGG GAG AAC TTC CCA CTT TCA 1510 

501 Glu Il« Arg Glu !.>•■ Pb* Gly Gly End ^10 
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THERMOCOCCT7S AEDXXX2RA OLTCOSIDAflE (188/0) 
COMPLETE OEMS SEQUENCE - 9/95 

1 ATC ATC CAC TCC CCC CTT AAA GCC ATT ATA TCT GAG CCT CCC GCC ATA ACC ATC ACA ATA 

1 Met Ile His Cys Pro Val Lys Gly Ile Ile Ser Glu Ala Arg Gly Ile Thr Ile Thr Ile

6 1 CAT TTA ACT TTT CAA GCC CAA ATA AAT AAT TTC CTC AAT CCT ATC ATT CTC TTT CCC CAC 

21 Asp Leu Ser Phe Gln Gly Gln Ile Asn Asn Leu Val Asn Ala Met Ile Val Phe Pro Glu

121 TTC TTC CTC TTT CCA ACC CCC ACA TCT TCT CAT CAC ATC CAC CCA CAT AAT AAA TCG AAC 

41 Phe Phe Leu Phe Gly Thr Ala Thr Ser Ser His Gln Ile Glu Gly Asp Asn Lys Trp Asn

181 GAC TCG TCC TAT TAT CAC CAC ATA CCT AAC CTC CCC TAC AAA TCC CCT AAA CCC TCC AAT 

61 Asp Trp Trp Tyr Tyr Glu Glu Ile Gly Lys Leu Pro Tyr Lys Ser Gly Lys Ala Cys Asn

241 CAC TCC CAC CTT TAC ACC CAA GAT ATA CAC CTA ATC CCA CAC CTC CCC TAC AAT CCC TAC 

81 His Trp Glu Leu Tyr Arg Glu Asp Ile Glu Leu Met Ala Gln Leu Gly Tyr Asn Ala Tyr

301 CCC TTT TCC ATA CAC TCC ACC CCT CTC TTC CCC CAA GAG GCC AAA TTC AAT CAA CAA CCC 

101 Arg Phe Ser Ile Glu Trp Ser Arg Leu Phe Pro Glu Glu Gly Lys Phe Asn Glu Glu Ala

361 TTC AAC CCC TAC CCT CAA ATA ATT GAA ATC CTC CTT GAG AAG CCC ATT ACT CCA AAC CTT 

121 Phe Asn Arg Tyr Arg Glu Ile Ile Glu Ile Leu Leu Glu Lys Gly Ile Thr Pro Asn Val

421 ACA CTG CAC CAC TTC ACA TCA CCC CTC TCC TTC ATC CCC AAG CCA GCC TTT TTC AAC GAA 

141 Thr Leu His His Phe Thr Ser Pro Leu Trp Phe Met Arg Lys Gly Gly Phe Leu Lys Glu

481 CAA AAC CTC AAC TAC TCG GAC CAC TAC CTT GAT AAA GCC GCG GAG CTC CTC AAC GGA CTC 

161 Glu Asn Leu Lys Tyr Trp Glu Gln Tyr Val Asp Lys Ala Ala Glu Leu Leu Lys Gly Val

541 AAG CTT CTA CCT ACA TTC AAC GAG CCC ATC CTC TAT CTT ATC ATC GCC TAC CTC ACA GCC 

181 Lys Leu Val Ala Thr Phe Asn Glu Pro Met Val Tyr Val Met Met Gly Tyr Leu Thr Ala

601 TAC TCC CCC CCC TTC ATC AAG ACT CCC TTT AAA GCC TTT AAA CTT CCC CCA AAC CTC CTT 

201 Tyr Trp Pro Pro Phe Ile Lys Ser Pro Phe Lys Ala Phe Lys Val Ala Ala Asn Leu Leu

661 AAC CCC CAT CCA ATC CCA TAT CAT ATC CTC CAT CCT AAC TTT CAT CTC GCC ATA CTT AAA 

221 Lys Ala His Ala Met Ala Tyr Asp Ile Leu His Gly Asn Phe Asp Val Gly Ile Val Lys

721 AAC ATC CCC ATA ATC CTC CCT CCA ACC AAC ACA GAC AAA CAC CTA GAA CCT CCC CAA AAG 

241 Asn Ile Pro Ile Met Leu Pro Ala Ser Asn Arg Glu Lys Asp Val Glu Ala Ala Gln Lys

781 GCC GAT AAC CTC TTT AAC TCC AAC TTC CTT GAT CCA ATA TCC ACC CCA AAA TAT AAA CCA 

261 Ala Asp Asn Leu Phe Asn Trp Asn Phe Leu Asp Ala Ile Trp Ser Gly Lys Tyr Lys Gly

841 CCT TTT CCA ACT TAC AAA ACT CCA CAA ACC CAT CCA CAC TTC ATA GCG ATA AAC TAC TAC 

281 Ala Phe Gly Thr Tyr Lys Thr Pro Glu Ser Asp Ala Asp Phe Ile Gly Ile Asn Tyr Tyr

901 ACA CCC ACC GAC CTA ACC CAT ACC TCC AAT CCC CTA AAC TTT TTC TTC GAT CCC AAG CTT 

301 Thr Ala Ser Glu Val Arg His Ser Trp Asn Pro Leu Lys Phe Phe Phe Asp Ala Lys Leu 

961 CCA GAC TTA AGC GAG ACA AAA ACA CAT ATC OCT TCC ACT CTC TAT CCA AAG GCC ATA TAC 

321 Ala Asp Leu Ser Glu Arg Lys Thr Asp Met Gly Trp Ser Val Tyr Pro Lys Gly Ile Tyr

1021 GAA CCT ATA CCA AAC CTT TCA CAC TAC CCA AAC CCA ATC TAC ATC ACC GAA AAC CCC ATA 

341 Glu Ala Ile Ala Lys Val Ser His Tyr Gly Lys Pro Met Tyr Ile Thr Glu Asn Gly Ile

1081 CCT ACC TTA CAC CAT CAC TCC ACC ATA GAC TTT ATC ATC CAC CAC CTC CAC TAC CTT CAC 

361 Ala Thr Leu Asp Asp Glu Trp Arg Ile Glu Phe Ile Ile Gln His Leu Gln Tyr Val His

1141 AAA CCC TTA AAC CAT GCC TTT GAC TTC ACA GCC TAC TTC TAT TCC TCT TTT ATC CAT AAC 

381 Lys Ala Leu Asn Asp Gly Phe Asp Leu Arg Gly Tyr Phe Tyr Trp Ser Phe Met Asp Asn

1201 TTC CAC TCC CCT CAC CCT TTT ACA CCA CCC TTT CCC CTC CTC CAC CTC GAC TAC ACC ACC 

401 Phe Glu Trp Ala Glu Gly Phe Arg Pro Arg Phe Gly Leu Val Glu Val Asp Tyr Thr Thr

12 61 TTC AAC ACC ACA CCC ACA AAC ACT CCT TAC ATA TAT GGA CAA ATT CCA AGC GAA AAC AAA 

421 Phe Lys Arg Arg Pro Arg Lys Ser Ala Tyr Ile Tyr Gly Glu Ile Ala Arg Glu Lys Lys

1321 ATA AAA GAC CAA CTC CTC CCA AAC TAT CCC CTT CCC CAC CTA TCA 1365 

441 Ile Lys Asp Glu Leu Leu Ala Lys Tyr Gly Leu Pro Glu Leu End 455
THERMOCOCCUS CHITONOPHAGUS GLYCOSIDASE
COMPLETE SEQUENCE - 9/95



22Q 



1 TTG CTT CCA GAG AAC TTT CTC TCC GGA GTT TCA CAG TCC GGA TTC CAC TTT GAA ATC CCC 

1 Met Leu Pro Glu Asn Phe Leu Trp Gly Val Ser Gln Ser Gly Phe Gln Phe Glu Met Gly

61 CAC ACA CTC AGG AGG CAC ATT CAT CCA AAC ACA GAT TCC TCC TAC TCC GTA AGA CAT GAA 

21 Asp Arg Leu Arg Arg His Ile Asp Pro Asn Thr Asp Trp Trp Tyr Trp Val Arg Asp Glu

121 TAT AAT ATC AAA AAA GGA CTA GTA ACT CGC CAT CTT CCC GAA CAC CCT ATA AAT TCA TAT 

41 Tyr Asn Ile Lys Lys Gly Leu Val Ser Gly Asp Leu Pro Glu Asp Gly Ile Asn Ser Tyr

181 GAA TTA TAT GAG AGA GAC CAA GAA ATT GCA AAG GAT TTA GGG CTC AAC ACA TAT AGG ATC 

61 Glu Leu Tyr Glu Arg Asp Gln Glu Ile Ala Lys Asp Leu Gly Leu Asn Thr Tyr Arg Ile

241 GGA ATT CAA TGG ACC AGA GTA TTT CCA TGG CCA ACG ACT TTT CTC GAC CTC GAC TAT GAA 

81 Gly Ile Glu Trp Ser Arg Val Phe Pro Trp Pro Thr Thr Phe Val Asp Val Glu Tyr Glu

301 ATT CAT GAG TCT TAC GGG TTC GTA AAG GAT CTC AAC ATT TCT AAA GAC CCA TTA CAA AAA 

101 Ile Asp Glu Ser Tyr Gly Leu Val Lys Asp Val Lys Ile Ser Lys Asp Ala Leu Glu Lys

361 CTT GAT GAA ATC CCT AAC CAA AGG GAA ATA ATA TAT TAT AGG AAC CTA ATA AAT TCC CTA 

121 Leu Asp Glu Ile Ala Asn Gln Arg Glu Ile Ile Tyr Tyr Arg Asn Leu Ile Asn Ser Leu

421 AGA AAG AGG CCT TTT AAG CTA ATA CTA AAC CTA AAT CAT TTT ACC CTC CCA ATA TGG CTT 

141 Arg Lys Arg Gly Phe Lys Val Ile Leu Asn Leu Asn His Phe Thr Leu Pro Ile Trp Leu

481 CAT GAT CCT ATC CAA TCT AGA GAA AAA GCC CTG ACC AAT AAG AGA AAC GGA TGG GTA AGC 

161 His Asp Pro Ile Glu Ser Arg Glu Lys Ala Leu Thr Asn Lys Arg Asn Gly Trp Val Ser

541 GAA AGG ACT GTT ATA GAG TTT GCA AAA TTT GCC GCG TAT TTA GCA TAT AAA TTC GGA GAC 

181 Glu Arg Ser Val Ile Glu Phe Ala Lys Phe Ala Ala Tyr Leu Ala Tyr Lys Phe Gly Asp

601 ATA GTA GAC ATG TGG AGC ACA TTT AAT GAA CCT ATG GTG CTC GCC GAG TTC GGG TAT TTA 

201 Ile Val Asp Met Trp Ser Thr Phe Asn Glu Pro Met Val Val Ala Glu Leu Gly Tyr Leu

661 GCC CCA TAC TCA GGA TTC CCC CCC GGA CTC ATG AAT CCA GAA GCA GCA AAG TTA GTT ATG 

221 Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala Ala Lys Leu Val Met

721 CTA CAT ATG ATA AAC GCC CAT GCT TTA GCA TAT AGG ATG ATA AAG AAA TTT GAC AGA AAA 

241 Leu His Met Ile Asn Ala His Ala Leu Ala Tyr Arg Met Ile Lys Lys Phe Asp Arg Lys

781 AAA GCT GAT CCA GAA TCA AAA GAA CCA CCT GAA ATA GGA ATT ATA TAC AAT AAC ATC GCC 

261 Lys Ala Asp Pro Glu Ser Lys Glu Pro Ala Glu Ile Gly Ile Ile Tyr Asn Asn Ile Gly

841 CTC ACA TAT CCG TTT AAT CCC AAA GAC TCA AAG GAT CTA CAA GCA TCC GAT AAT GCC AAT 

281 Val Thr Tyr Pro Phe Asn Pro Lys Asp Ser Lys Asp Leu Gln Ala Ser Asp Asn Ala Asn

901 TTC TTC CAC ACT GGG CTA TTC TTA ACG GCT ATC CAC AGG GGA AAA TTA AAT ATC GAA TTT 

301 Phe Phe His Ser Gly Leu Phe Leu Thr Ala Ile His Arg Gly Lys Leu Asn Ile Glu Phe

961 GAC GGA GAG ACA TTT CTT TAC CTT CCA TAT TTA AAG CGC AAT GAT TGG CTG GGA CTC AAT 

321 Asp Gly Glu Thr Phe Val Tyr Leu Pro Tyr Leu Lys Gly Asn Asp Trp Leu Gly Val Asn

1021 TAT TAT ACA AGA GAA CTC CTT AAA TAC CAA GAT CCC ATC TTT CCA ACT ATC CCT CTC ATA 

341 Tyr Tyr Thr Arg Glu Val Val Lys Tyr Gln Asp Pro Met Phe Pro Ser Ile Pro Leu Ile

1081 AGC TTC AAG GCC CTT CCA GAT TAT GGA TAC GGA TCT AGA CCA GGA ACC ACC TCA AAG GAC 

361 Ser Phe Lys Gly Val Pro Asp Tyr Gly Tyr Gly Cys Arg Pro Gly Thr Thr Ser Lys Asp

1141 GCT AAT CCT GTT ACT GAC ATT GGA TGG GAG CTA TAT CCC AAA CGC ATG TAC GAC TCT ATA 

381 Gly Asn Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Lys Gly Met Tyr Asp Ser Ile

1201 GTA GCT CCC AAT GAA TAT GGA CTT CCT GTA TAC GTA ACA GAA AAC GGA ATA GCA GAT TCA 

401 Val Ala Ala Asn Glu Tyr Gly Val Pro Val Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser

1261 AAA CAT CTA TTA AGG CCC TAT TAC ATC GCA TCT CAC ATT CAA CCC ATC GAA GAC GCT TAC 

421 Lys Asp Val Leu Arg Pro Tyr Tyr Ile Ala Ser His Ile Glu Ala Met Glu Glu Ala Tyr
1321 CAA AAT CCT

441 Glu Asn Gly

1381 CCC TTA GGC 

461 Ala Leu Gly 

1441 AAA CCC AGO 

481 Lys Pro Arg 

1501 ACC AAC ATC 

501 Ser Asn Ile
Bankia gouldi endoglucanase (37GP1)

9 18 27 36 45 54

5' ATG AGA ATA CGT TTA GCG ACQ CTC GCO CTC TCC CCA GCG CTG AOC CCA CTC ACC
Met Arg Ile Arg Leu Ala Thr Leu Ala Leu Cys Ala Ala Leu Ser Pro Val Thr

63 72 81 90 99 108

TTT CCA GAT AAT GTA ACC GTA CAA ATC GAC GCC GAC GCC GGT AAA AAA CTC ATC
Phe Ala Asp Asn Val Thr Val Gln Ile Asp Ala Asp Gly Gly Lys Lys Leu Ile

117 126 135 144 153 162 

ACC CGA GCC CTT TAC GGC ATC AAT AAC TCC AAC CCA CAA ACC CTT ACC GAT ACT 
Ser Arg Ala Leu Tyr Gly Met Asn Asn Ser Asn Ala Glu Ser Leu Thr Asp Thr

171 180 189 198 207 216

GAC TGG CAG CGT TTT CCC GAT CCA GGT GTG CGC ATG CTG CGG GAA AAT GGC GGC
Asp Trp Gln Arg Phe Arg Asp Ala Gly Val Arg Met Leu Arg Glu Asn Gly Gly

225 234 243 252 261 270 

AAC AAC ACC ACC AAA TAT AAC TGG CAA CTC CAC CTC ACC ACT CAT CCG GAT TGG 
Asn Asn Ser Thr Lys Tyr Asn Trp Gln Leu His Leu Ser Ser His Pro Asp Trp

279 288 297 306 315 324 

TAC AAC AAT CTC TAC GCC GCC AAC AAC AAC TGG GAC AAC CGG GTA GCC CTG ATT 
Tyr Asn Asn Val Tyr Ala Gly Asn Asn Asn Trp Asp Asn Arg Val Ala Leu Ile

333 342 351 360 369 378 

CAG GAA AAC CTG CCC GGC GCC GAC ACC ATC TGG CCA TTC CAC CTC ATC GGT AAC 
Gln Glu Asn Leu Pro Gly Ala Asp Thr Met Trp Ala Phe Gln Leu Ile Gly Lys

387 396 405 414 423 432 

QTC GCG GCG ACT TCT GCC TAC AAC TTT AAC GAT TGG GAA TTC AAC CAG TCG CAA 
Val Ala Ala Thr Ser Ala Tyr Asn Phe Asn Asp Trp Glu Phe Asn Gln Ser Gln

441 450 459 468 477 486 

TGG TGG ACC GGC GTC GOT CAG AAT CTC GCT GGC GGC GGT GAA CCC AAT CTG GAC 

Trp Trp Thr Gly Val Ala Gln Asn Leu Ala Gly Gly Gly Glu Pro Asn Leu Asp

495 504 513 522 531 540 

GGC GGC GGC GAA GCG CTG GTT GAA GGA GAC CCC AAT CTC TAC CTC ATG GAT TGG 
Gly Gly Gly Glu Ala Leu Val Glu Gly Asp Pro Asn Leu Tyr Leu Met Asp Trp

549 558 567 576 585 594

TCG CCA GCC GAC ACT GTG GGT ATT CTC GAC CAC TCG TVT GGC GTA AAC GGG CTC 
Ser Pro Ala Asp Thr Val Gly Ile Leu Asp His Trp Phe Gly Val Asn Gly Leu

603 612 621 630 639 648 

GGC GTG CGG CGT GGC AAA GCC AAA TAC TOO ACT ATG GAT AAC GAG CCC GGC ATC 
Gly Val Arg Arg Gly Lys Ala Lys Tyr Trp Ser Met Asp Asn Glu Pro Gly Ile

657 666 675 684 693 702 

TGG GTT GGC ACC CAC GAC GAT GTA OTG AAA GAA CAA ACG CCG GTA GAA GAT TTC 
Trp Val Gly Thr His Asp Asp Val Val Lys Glu Gln Thr Pro Val Glu Asp Phe
765 774 783 792 801 810

... ATC ACC GGT CCC CTG CCC GCT AAT GAG TGG GAG TGG TAT GCC TGG GGC GGT
Lys Ile Thr Gly Pro Val Pro Ala Asn Glu Trp Gln Trp Tyr Ala Trp Gly Gly

819 828 837 846 855 864

TTC TCO GTA CCC CAG GAA CAA GGG TTT ATG AGC TGG ATG GAG TAT TTC ATC AAG
Phe Ser Val Pro Gln Glu Gln Gly Phe Met Ser Trp Met Glu Tyr Phe Ile Lys

873 882 891 900 909 918

CGG GTG TCT GAA GAC CAA CGC GCA ACT GGT GTT CGC CTC CTC GAT GTA CTC GAT
Arg Val Ser Glu Glu Gln Arg Ala Ser Gly Val Arg Leu Leu Asp Val Leu Asp

927 936 945 954 963 972

CTO CAC TAC TAC CCC GGC GCT TAC AAT GCG GAA GAT ATC GTG CAA TTA CAT CGC
Leu His Tyr Tyr Pro Gly Ala Tyr Asn Ala Glu Asp Ile Val Gln Leu His Arg

981 990 999 1008 1017 1026

ACG TTC TTC GAC CGC GAC TTT GTT TCA CTG GAT GCC AAC GGG GTG AAA ATG GTA
Thr Phe Phe Asp Arg Asp Phe Val Ser Leu Asp Ala Asn Gly Val Lys Met Val

1035 1044 1053 1062 1071 1080

... GGT GGC TGG GAT GAC ... ATC AAC AAG GAA TAT ATT TTC GGG VGA GTG AAC
Glu Gly Gly Trp Asp Asp Ser Ile Asn Lys Glu Tyr Ile Phe Gly Arg Val Asn

1089 1098 1107 1116 1125 1134

GAT TGG CTC GAC GAA TAT ATG GGG CCA GAC CAT GGT GTA ACC CTG GGC TTA ACC
Asp Trp Leu Glu Glu Tyr Met Gly Pro Asp His Gly Val Thr Leu Gly Leu Thr

1143 1152 1161 1170 1179 1188

GAA ATG TGC ... CGC AAT GTG AAT CCG ATG ACT ACC GCC ATC TGG TAT GCC TCC
Glu Met Cys Val Arg Asn Val Asn Pro Met Thr Thr Ala Ile Trp Tyr Ala Ser

1197 1206 1215 1224 1233 1242 

ATG CTC GGC ACC TTC GCG GAT AAC GGC CTC GAA ATA TTC ACC CCA TGG TGC TGG 
Met Leu Gly Thr Phe Ala Asp Asn Gly Val Glu Ile Phe Thr Pro Trp Cys Trp

1251 1260 1269 1278 1287 1296 

AAC ACC GGA ATG TGG GAA ACA CTC CAC CTC TTC AGC CGC TAC AAC AAA CCT TAT 
Asn Thr Gly Met Trp Glu Thr Leu His Leu Phe Ser Arg Tyr Asn Lys Pro Tyr

1305 1314 1323 1332 1341 1350 

CGG GTC GCC TCC AGC TCC AGT CTT GAA GAG TTT GTC AGC GCC TAC AGC TCC ATT 
Arg Val Ala Ser Ser Ser Ser Leu Glu Glu Phe Val Ser Ala Tyr Ser Ser Ile

1359 1368 1377 1386 1395 1404 

AAC GAA GCA GAA OAC GCC ATG ACG GTA CTT CTG GTG AAT CGT TCC ACT AGC GAC 
Asn Glu Ala Glu Asp Ala Met Thr Val Leu Leu Val Asn Arg Ser Thr Ser Glu 
1413 1422 1431 1440 1449 1458

ACC CAC ACC GCC ACT GTC GCT ATC GXC GAT TTC CCA CTG GAT GGC CCC TAC CCC 
£ £. S » Val Ala Ile Asp Asp Phe Pro Leu Asp Gly Pro Tyr Arg

1467 1476 1485 1494 1503 1512

XCC CTG CGC TTA CAC AAC CTG CCG GGG GAG GAA ACC TTC GTA TCT CAC CGA GAC 
tS Leu Arg Leu His Asn Leu Pro Gly Glu Glu Thr Phe Val Ser Bx. Arg Asp

1521 1530 1539 1548 1557 1566 

AAC GCC CTO GAA AAA GGT ACA GTG CGC CCC AGC GAC AAT ACG GTA ACA CTC CAC 
Asn Ala Leu Glu Lys Gly Thr Val Arg Ala Ser Asp Asn Thr Val Thr Leu Glu

1575 1584 1593 1602 1611

TTG CCC CCT CTG TCC GTT ACT GCA ATA TTG CTC AAG GCC CGG CCC TAA 3'
Leu Pro Pro Leu Ser Val Thr Ala Ile Leu Leu Lys Ala Arg Pro ***
5' CTG ATC TCT GTG GAA ATA TTC GGA AAG ACC TTC ACA ^AT. GGA AGA TTC GTT CTC

Val Ile Cys Val Glu Ile Phe Gly Lys Thr Phe Arg Glu Gly Arg Phe Val Leu

63 72 81 90 99 108 

AAA GAG AAA AAC TTC ACA CTT GAG TIC OCG CTC GAG AAG ATA CAC CTT OQC TOG 



Lys Glu Lys Asn Phe Thr Val Glu Phe Ala Val Glu Lys Ile His Leu Gly Trp

117 126 135 144 153 162 

AAG ATC TCC GGC AGG GTG AAG GGA AGT CCC GGA AGG CTT GAG OTT CTT CGA ACG 



Lys Ile Ser Gly Arg Val Lys Gly Ser Pro Gly Arg Leu Glu Val Leu Arg Thr

171 180 189 198 207 216 

AAA GCA CCG GAA AAG GTA CTT GTG AAC AAC TOG CAG TCC TOG GGA CCG TCC AGG 



Lys Ala Pro Glu Lys Val Leu Val Asn Asn Trp Gln Ser Trp Gly Pro Cys Arg

225 234 243 252 261 270 

GTG CTC GAT GCC TIT TCT TTC AAA CCA CCT GAA ATA GAT OCXS AAC TOG AGA TAC 

Val Val Asp Ala Phe Ser Phe Lys Pro Pro Glu Ile Asp Pro Asn Trp Arg Tyr

279 288 297 306 315 324 

ACC GCT TOG GTG GTG CCC GAT GTA CTT GAA AGG AAC CTC CAG AGC GAC TAT TTC 

Thr Ala Ser Val Val Pro Asp Val Leu Glu Arg Asn Leu Gln Ser Asp Tyr Phe

333 342 351 360 369 378 

GTC GCT GAA GAA GGA AAA GTG TAC GGT TTT CTG AGT TOG AAA ATC GCA CAT CCT 



Val Ala Glu Glu Gly Lys Val Tyr Gly Phe Leu Ser Ser Lys Ile Ala His Pro



387 396 405 414 423 432 

TTC TTC GCT GTG GAA GAT GGG GAA CTT GTG GCA TAC CTC GAA TAT TTC GAT GTC 



Phe Phe Ala Val Glu Asp Gly Glu Leu Val Ala Tyr Leu Glu Tyr Phe Asp Val


441 450 459 468 477 486

GAG TTC GAC GAC TTT GTT CCT CTT GAA CCT CTC GTT GTA CTC GAG GAT CCC AAC

Glu Phe Asp Asp Phe Val Pro Leu Glu Pro Leu Val Val Leu Glu Asp Pro Asn

495 504 513 522 531 540

ACA CCC CTT CTT CTG GAG AAA TAC GCC GAA CTC GTC GGA ATC GAA AAC AAC GCC

Thr Pro Leu ... Leu Glu Lys Tyr Ala Glu Leu Val Gly Ile Glu Asn Asn Ala

549 558 567 576 585 594

AGA GTT CCA AAA CAC ACA CCC ACT GGA TGG TGC AGC TGC TAC CAT TAC TTC CTT

Arg Val Pro Lys His Thr Pro Thr Gly Trp Cys ... Trp Tyr His Tyr Phe Leu
Thermotoga maritima Alpha-Galactosidase
603 612 621 630 639 648

oat ere xrrc tog gaa gag acc ore AAG aac ctc aag ctc oco aac aat ttc ccc 



Asp Leu TVur Trp Glu Glu Thr Leu Lys Asn Leu Lys Leu Ala Lys Asn Phe Pro

657 666 675 684 693 702 

TTC GAG GTC TTC GAG ATA GAC GAC CCC TAG GAA AAC GAG ATA GGT GAG TOG CTC 

Phe Glu Val Phe Gln Ile Asp Asp Ala Tyr Glu Lys Asp Ile Gly Asp Trp Leu

711 720 729 738 747 756 

GTO ACA AGA GGA GAC TIT CCA TOG GTG GAA GAG ATG GCA AAA GTT ATA GCG GAA 

Val Thr Arg Gly Asp Phe Pro Ser Val Glu Glu Met Ala Lys Val Ile Ala Glu

765 774 783 792 801 810 

AAC GGT TIC ATC CCG GGC ATA TGG ACC GCC CCG TTC ACT GTT TCT GAA AOC TCC 

Asn Gly Phe Ile Pro Gly Ile Trp Thr Ala Pro Phe Ser Val Ser Glu Thr Ser

819 828 837 846 855 864 

GAT GTA TTC AAC GAA CAT CCO GAC TGG GIA CTC AAG GAA AAC GGA GAG COG AAG 

Asp Val Phe Asn Glu His Pro Asp Trp Val Val Lys Glu Asn Gly Glu Pro Lys 

873 882 891 900 909 918 

ATG OCT TAG AGA AAC TOG AAC AAA AAG KTfi T»C GCC CTC GAT CTT TOG AAA GAT 

Met Ala Tyr Arg Asn Trp Asn Lys Lys Ile Tyr Ala Leu Asp Leu Ser Lys Asp

927 936 945 954 963 972 

GAG GTT CTC AAC TGG CTT TTC GAT CTC TTC TCA TCT CTC AGA AAG ATG GGC TRC 

Glu Val Leu Asn Trp Leu Phe Asp Leu Phe Ser Ser Leu Arg Lys Met Gly Tyr 

981 990 999 1008 1017 1026 

AOG TAC TTC AAG ATC GAC TTT CTC TTC CCG GGT GCC CTT CCA GGA GAA AGA AAA

... Tyr Phe Lys Ile Asp Phe Leu Phe Ala Gly Ala Val Pro Gly Glu Arg Lys

1035 1044 1053 1062 1071 1080

AAG AAC ATA ACA CCA ATT CAG GCG TTC AGA AAA GGG ATT GAG ACG ATC AGA AAA

Lys Asn Ile Thr Pro Ile Gln Ala Phe Arg Lys Gly Ile Glu Thr Ile Arg Lys

1089 1098 1107 1116 1125 1134

GCC GTG GGA GAA GAT TCT TTC ATC CTC OCA TGC GGC TCT CCC CTT CTT CCC GCA

Ala Val Gly Glu Asp Ser Phe Ile Leu Gly Cys Gly Ser Pro Leu Leu Pro Ala

1143 1152 1161 1170 1179 1188

CTC GGA TGC GTC GAC GGG ATG AGG ATA GGA CCT GAC ACT GCG CCG TTC TGG GGA

Val Gly Cys Val Asp Gly Met Arg Ile Gly Pro Asp Thr Ala Pro Phe Trp Gly

Figure 10 (Continued) 



WO 97/25417 PCT/US97/00092



Thermotoga maritima Alpha-Galactosidase (continued)

1197 1206 1215 1224 1233 1242

GAA CAT ATA GAA GAC AAC OCA OCT CCC OCT OCA AGA TOG GOG CTC AGA AAC OCC 



Glu His Ile Glu Asp Asn Gly Ala Pro Ala Ala Arg Trp Ala Leu Arg Asn Ala

1251 1260 1269 1278 1287 1296

ATA ACG AGO TAC TIC ATC CAC GAC AOG TTC TOG CTC AAC GAC COC GAC TOT CTC 

Ile Thr Arg Tyr Phe Met His Asp Arg Phe Trp Leu Asn Asp Pro Asp Cys Leu

1305 1314 1323 1332 1341 1350 

AXA CTC AGA GAG GAG AAA ACG GAT CTC ACA CAG AAG GAA AAG GAG CTC TAC TOG 



Ile Leu Arg Glu Glu Lys Thr Asp Leu Thr Gln Lys Glu Lys Glu Leu Tyr Ser

1359 1368 1377 1386 1395 1404 

TAC ACG TOT GGA GTG CTC GAC AAC ATC ATC ATA GAA AGO GAT GAT CTC TCG CTC 



Tyr Thr Cys Gly Val Leu Asp Asn Met Ile Ile Glu Ser Asp Asp Leu Ser Leu

1413 1422 1431 1440 1449 1458

GTC AGA GAT CAT GGA AAA AAG GTT CTC AAA GAA ACG CTC GAA CTC CTC GOT GGA 

Val Arg Asp His Gly Lys Lys Val Leu Lys Glu Thr Leu Glu Leu Leu Gly Gly

1467 1476 1485 1494 1503 1512 

AGA CCA COG GTT GAA AAC ATC ATC TCG GAG GAT CTC AGA TAC GAC ATC GTC TCG 

Arg Pro Arg Val Gln Asn Ile Met Ser Glu Asp Leu Arg Tyr Glu Ile Val Ser

1521 1530 1539 1548 1557 1566 

TCT GGC ACT CTC TCA OCA AAC GTC AAG ATC GTC GTC GAT CTC AAC AGC AGA GAG 

Ser Gly Thr Leu Ser Gly Asn Val Lys Ile Val Val Asp Leu Asn Ser Arg Glu

1575 1584 1593 1602 1611 1620

TAC CAC CTC GAA AAA GAA GGA AAG TCC TCC CTC AAA AAA AGA C7IC GTC AAA AGA 

Tyr His Leu Glu Lys Glu Gly Lys Ser Ser Leu Lys Lys Arg Val Val Lys Arg

1629 1638 1647 1656 1665 

GAA GAC GGA AGA AAC TTC TAC TTC TAC GAA CAC OCT GAG AGA GAA TCA 3'

Glu Asp Gly Arg Asn Phe Tyr Phe Tyr Glu Glu Gly Glu Arg Glu
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9 18 27 36 45 54 

5 ' ATG GGG ATT GGT GGC GAC GAC TCC TGG AGC CCG TCA GTA TCG GCG GAA TTC CTT 



Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Leu

63 72 81 90 99 108 

TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA ACT GAC GAC TTC CTG AAA 



Leu Leu Ile Val Glu Leu Ser Phe Val Leu Phe Ala Ser Asp Glu Phe Val Lys

117 126 135 144 153 162 

GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC ATT GGA AGC 



Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe Ile Gly Ser

171 180 189 198 207 216 

AAC AAC TAG TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC ACT GTT CTG GAG 



Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Met Ile Asp Ser Val Leu Glu

225 234 243 252 261 270 

ACT GCC AGA GAC ATG GGT ATA AAG GTC CTC AGA ATC TGG GGT TTC CTC GAC GGG 



Ser Ala Arg Asp Met Gly Ile Lys Val Leu Arg Ile Trp Gly Phe Leu Asp Gly

279 288 297 306 315 324 

GAG AGT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC 



Glu Ser Tyr Cys Arg Asp Lys Asn Thr Tyr Met His Pro Glu Pro Gly Val Phe 

333 342 351 360 369 378 

GGG CTG CCA GAA GGA ATA TCO AAC GCC CAO AGC GGT TTC GAA AGA CTC GAC TAC 



Gly Val Pro Glu Gly Ile Ser Asn Ala Gln Ser Gly Phe Glu Arg Leu Asp Tyr

387 396 405 414 423 432 

ACA CTT GCG AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC 



Thr Val Ala Lys Ala Lys Glu Leu Gly Ile Lys Leu Val Ile Val Leu Val Asn

441 450 459 468 477 486 

AAC TCG GAC GAC TTC GCT GGA ATG AAC CAG TAC CTG AGG TGG TTT GGA GGA ACC 



Asn Trp Asp Asp Phe Gly Gly Met Asn Gln Tyr Val Arg Trp Phe Gly Gly Thr

495 504 513 522 531 540 

CAT CAC GAC GAT TTC TAC AGA GAT GAG AAG ATC AAA GAA GAG TAC AAA AAG TAC 



His His Asp Asp Phe Tyr Arg Asp Glu Lys Ile Lys Glu Glu Tyr Lys Lys Tyr

Figure 11 








Thermotoga maritima β-Mannanase (continued)



549 558 567 576 585 594 

GTC TCC TTT CTC GTA AAC CAT GTC AAT ACC TAC ACG GGA GTT CCT TAC AGO GAA 

Val Ser Phe Leu Val Asn His Val Asn Thr Tyr Thr Gly Val Pro Tyr Arg Glu

603 612 621 630 639 648 

GAG CCC ACC ATC ATG GCC TGG GAG CTT GCA AAC GAA CCG CGC TGT GAG ACG GAC 

Glu Pro Thr Ile Met Ala Trp Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp

657 666 675 684 693 702 

AAA TCG GGG AAC ACG CTC GTT GAG TGG GTG AAG GAG ATG AGC TCC TAC ATA AAG 

Lys Ser Gly Asn Thr Leu Val Glu Trp Val Lys Glu Met Ser Ser Tyr Ile Lys

711 720 729 738 747 756 

AGT CTG GAT CCC AAC CAC CTC GTG GCT GTG GGG GAC GAA GGA TTC TTC AGC AAC 

Ser Leu Asp Pro Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn 

765 774 783 792 801 810 

TAC GAA GGA TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TGG 



Tyr Glu Gly Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp



819 828 837 846 855 864 

TCC GGT GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG 

Ser Gly Val Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr

873 882 891 900 909 918 

TTC CAC CTC TAT CCG TCC CAC TOG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG 

Phe His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

927 936 945 954 963 972 

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA CCC 

Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys Pro

981 990 999 1008 1017 1026 

GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GTT AAC AGA ACG GCC 

Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val Asn Arg Thr Ala

1035 1044 1053 1062 1071 1080 

ATC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT GGA GCG ATG 

Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp Gly Ala Met

Figure 11 (Continued) 
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1089 1098 1107 1116 1125 1134 

TTC TGG ATG CTC GCG GGA ATC GGO GAA GGT TCG GAC AGA GAC GAG AGA GGG TAC 

Phe Trp Met Leu Ala Gly Ile Gly Glu Gly Ser Asp Arg Asp Glu Arg Gly Tyr

1143 1152 1161 1170 1179 1188 

TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC AGT CCA GAA GCG GAA 

Tyr Pro Asp Tyr Asp Gly Phe Arg Ile Val Asn Asp Asp Ser Pro Glu Ala Glu

1197 1206 1215 1224 1233 1242 

CTG ATA AGA GAA TAC GCG AAG CTO TTC AAC ACA GGT GAA GAC ATA AGA GAA GAC 

Leu Ile Arg Glu Tyr Ala Lys Leu Phe Asn Thr Gly Glu Asp Ile Arg Glu Asp

1251 1260 1269 1278 1287 1296 

ACC TGC TCT TTC ATC CTT CCA AAA GAC GGC ATQ GAG ATC AAA AAG ACC GTG GAA 

Thr Cys Ser Phe Ile Leu Pro Lys Asp Gly Met Glu Ile Lys Lys Thr Val Glu

1305 1314 1323 1332 1341 1350 

GTG AGG GCT GGT GTT TTC GAC TAC AGC AAC ACQ TTT GAA AAG TTG TCT GTC AAA 

Val Arg Ala Gly Val Phe Asp Tyr Ser Asn Thr Phe Glu Lys Leu Ser Val Lys 

1359 1368 1377 1386 1395 1404 

GTC GAA GAT CTG GTT TTT GAA AAT GAG ATA GAG CAT CTC GGA TAC GGA ATT TAC 

Val Glu Asp Leu Val Phe Glu Asn Glu Ile Glu His Leu Gly Tyr Gly Ile Tyr

1413 1422 1431 1440 1449 1458 

GGC TTT GAT CTC GAC ACA ACC CGG ATC CCG GAT GGA GAA CAT GAA ATG TTC CTT 

Gly Phe Asp Leu Asp Thr Thr Arg Ile Pro Asp Gly Glu His Glu Met Phe Leu

1467 1476 1485 1494 1503 1512 

GAA GGC CAC TTT CAG GGA AAA ACG GTG AAA GAC TCT ATC AAA GCG AAA GTG GTG 

Glu Gly His Phe Gln Gly Lys Thr Val Lys Asp Ser Ile Lys Ala Lys Val Val

1521 1530 1539 1548 1557 1566

AAC GAA GGA CGG TAC GTG CTC GCA GAG GAA GTT GAT TTT TCC TCT CCA GAA GAG 

Asn Glu Ala Arg Tyr Val Leu Ala Glu Glu Val Asp Phe Ser Ser Pro Glu Glu 

1575 1584 1593 1602 1611 1620 

GTG AAA AAC TGG TGG AAC AGC GGA ACC TOG CAG GCA GAG TTC GGC TCA CCT GAC 

Val Lys Asn Trp Trp Asn Ser Gly Thr Trp Gln Ala Glu Phe Gly Ser Pro Asp

Figure 11 (Continued) 
1629 1638 1647 1656 1665 1674

ATT GAA TGG AAC GGT GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC GTG AAA CTC 

Ile Glu Trp Asn Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu

1683 1692 1701 1710 1719 1728 

CCC GGA AAG AGC GAC TGG GAA GAA GTG AGA GTA GCA AGG AAG TTC GAA AGA CTC 

Pro Gly Lys Ser Asp Trp Glu Glu Val Arg Val Ala Arg Lys Phe Glu Arg Leu 

1737 1746 1755 1764 1773 1782

TCA GAA TCT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA AAC GTC GAG GGA CTC 

Ser Glu Cys Glu Ile Leu Glu Tyr Asp Ile Tyr Ile Pro Asn Val Glu Gly Leu

1791 1800 1809 1818 1827 1836 

AAG GGA AGG TTG AGG CCG TAC GCG GTT CTG AAC CCC GGC TGG GTG AAG ATA GGC 

Lys Gly Arg Leu Arg Pro Tyr Ala Val Leu Asn Pro Gly Trp Val Lys Ile Gly

1845 1854 1863 1872 1881 1890 

CTC GAC ATG AAC AAC GCG AAC GTG GAA ACT GCG GAG ATC ATC ACT TTC GGC GGA 

Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly Gly

1899 1908 1917 1926 1935 1944 

AAA GAG TAC AGA AGA TTC CAT GTA AGA ATT GAG TTC GAC AGA ACA GCG GGG GTG 

Lys Glu Tyr Arg Arg Phe His Val Arg Ile Glu Phe Asp Arg Thr Ala Gly Val

1953 1962 1971 1980 1989 1998 

AAA GAA CTT CAC ATA GGA GTT GTC GGT GAT CAT CTG AGG TAC GAT GGA CCG ATT 

Lys Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp Gly Pro Ile

2007 2016 2025 2034 2043 

TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG TGA 3' 

Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met ***
9 18 27 36 45 54

ATG CTA CCA GAA GAG TTC CTA TGG GGC GTT GGG CAG TCA GGC TTT CAG TTC GAA

Met Leu Pro Glu Glu Phe Leu Trp Gly Val Gly Gln Ser Gly Phe Gln Phe Glu

63 72 81 90 99 108

ATG GGC GAC AAG CTC AGG AGG CAC ATC GAT CCA AAT ACC GAC TGG TGG AAG TGG

Met Gly Asp Lys Leu Arg Arg His Ile Asp Pro Asn Thr Asp Trp Trp Lys Trp

117 126 135 144 153 162

GTT CGC GAT CCT TTC AAC ATA AAA AAG GAG CTT GTG AGT GGG GAC CTT CCC GAG

Val Arg Asp Pro Phe Asn Ile Lys Lys Glu Leu Val Ser Gly Asp Leu Pro Glu

171 180 189 198 207 216

GAC GGC ATC AAC AAC TAC GAA CTT TTT GAA AAC GAT CAC AAG CTC GCT AAA GGC

Asp Gly Ile Asn Asn Tyr Glu Leu Phe Glu Asn Asp His Lys Leu Ala Lys Gly

225 234 243 252 261 270

CTT GGA CTC AAC GCA TAC AGG ATT GGA ATA GAG TGG AGC AGA ATC TTT CCC TGG

Leu Gly Leu Asn Ala Tyr Arg Ile Gly Ile Glu Trp Ser Arg Ile Phe Pro Trp

279 288 297 306 315 324

CCG ACQ TGG ACG GTC GAT ACC GAG GTC GAG TTC GAC ACT TAC GCT TTA GTA AAG

Pro Thr Trp Thr Val Asp Thr Glu Val Glu Phe Asp Thr Tyr Gly Leu Val Lys

333 342 351 360 369 378

GAC GTT AAG ATA GAC AAG TCC ACC CTT GCT GAA CTC GAC AGG CTG GCC AAC AAG

Asp Val Lys Ile Asp Lys Ser Thr Leu Ala Glu Leu Asp Arg Leu Ala Asn Lys

387 396 405 414 423 432

GAG GAG GTA ATG TAC TAC AGG CGC GTT ATT CAG CAT TTG AGG GAG CTC GGC TTC

Glu Glu Val Met Tyr Tyr Arg Arg Val Ile Gln His Leu Arg Glu Leu Gly Phe

441 450 459 468 477 486

AAG GTC TTC GTT AAC CTC AAC CAC TTC ACG CTT CCA ATA TGG CTC CAC GAC CCG

Lys Val Phe Val Asn Leu Asn His Phe Thr Leu Pro Ile Trp Leu His Asp Pro

495 504 513 522 531 540

ATA GTG GCA AGG GAG AAG GCC CTC ACA AAC GAC AGA ATC GGC TGG GTC TCC CAG

Ile Val Ala Arg Glu Lys Ala Leu Thr Asn Asp Arg Ile Gly Trp Val Ser Gln
AEPII 1a β-Mannosidase (63GB1) (continued)

549 558 567 576 585 594 

AGO ACA GTT GTT GAG TTT GCC AAG TAT OCT GCT TAC ATC GCC CAT GCG CTC GGA 

Arg Thr Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala His Ala Leu Gly

603 612 621 630 639 648

GAC CTC GTO GAC ACA TGG AGC ACC TTC AAC GAA CCT ATC GTA GTT GTO GAG CTC 

Asp Leu Val Asp Thr Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu 

657 666 675 684 693 702

GGC TAC CTC GCC CCC TAC TCA GGA TTT CCC CCO GGA GTC ATG AAC CCC GAG GCC 

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala

711 720 729 738 747 756 

GCG AAG CTG GCG ATC CTC AAC ATG ATA AAC CCC CAC GCC TTG GCA TAT AAG ATC 

Ala Lys Leu Ala Ile Leu Asn Met Ile Asn Ala His Ala Leu Ala Tyr Lys Met

765 774 783 792 801 810 

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAO CAT AGC AAG TCC CCT GCG GAC 

Ile Lys Arg Phe Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro Ala Asp

819 828 837 846 855 864 

GTT GGC ATA ATT TAC AAC AAC ATC GGT GTT GCC TAC CCT AAA GAC CCT AAC GAT 

Val Gly Ile Ile Tyr Asn Asn Ile Gly Val Ala Tyr Pro Lys Asp Pro Asn Asp

873 882 891 900 909 918

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC AGC GGA CTG TTC 

Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly Leu Phe

927 936 945 954 963 972 

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAG TTC CAC GGC GAA AAC TTT 

Phe Asp Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe Asp Gly Glu Asn Phe

981 990 999 1008 1017 1026

GTA AAA GTT AGA CAC CTA AAA GCC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp Ile Gly Leu Asn Tyr Tyr Thr



1035 1044 1053 1062 1071 1080

CGC GAC GTT GTT AGA TAT TCG OAG CCC AAG TTC CCA ACT ATA CCC CTC ATA TCC 

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser

Figure 12 (Continued) 
AEPII 1a β-Mannosidase (63GB1) (continued)

1089 1098 1107 1116 1125 1134 

TTC AAG GGC GTT CCC AAC TAC GGC TAC TCC TGC AGG CCC GGC ACQ ACC TCC CCC 

Phe Lys Gly Val Pro Asn Tyr Gly Tyr Ser Cys Arg Pro Gly Thr Thr Ser Ala

1143 1152 1161 1170 1179 1188

GAT GGC ATG CCC GTC AGC GAT ATC GGC TGG GAA GTC TAT CCC CAG GGA ATC TAC 

Asp Gly Met Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Gln Gly Ile Tyr

1197 1206 1215 1224 1233 1242

GAC TCG ATA GTC GAG GCC ACC AAO TAC ACT GTT CCT GTT TAC GTC ACC GAG AAC 

Asp Ser Ile Val Glu Ala Thr Lys Tyr Ser Val Pro Val Tyr Val Thr Glu Asn

1251 1260 1269 1278 1287 1296 

GGT GTT GCG GAT TCC GCG GAC ACG CTG AGG CCA TAC TAC ATA GTC AGC CAC GTC

Gly Val Ala Asp Ser Ala Asp Thr Leu Arg Pro Tyr Tyr Ile Val Ser His Val



1305 1314 1323 1332 1341 1350 

TCA AAG ATA GAG GAA GCC ATT GAG AAT GGA TAC CCC GTA AAA GGC TAC ATG TAC 

Ser Lys Ile Glu Glu Ala Ile Glu Asn Gly Tyr Pro Val Lys Gly Tyr Met Tyr

1359 1368 1377 1386 1395 1404 

TGG GCC CTT ACG GAT AAC TAC GAG TGG GCC CTC GGC TTC AGC ATG AGG TTT GGT 

Trp Ala Leu Thr Asp Asn Tyr Glu Trp Ala Leu Gly Phe Ser Met Arg Phe Gly 

1413 1422 1431 1440 1449 1458 

CTC TAC AAG GTC GAC CTC ATC TCC AAG GAG AGG ATC CCG AGG GAG AGA AGC GTT 

Leu Tyr Lys Val Asp Leu Ile Ser Lys Glu Arg Ile Pro Arg Glu Arg Ser Val

1467 1476 1485 1494 1503 1512

GAG ATA TAT CCC AGG ATA CTG CAG TCC AAC GGT GTT CCT AAG GAT ATC AAA GAG 



Glu Ile Tyr Arg Arg Ile Val Gln Ser Asn Gly Val Pro Lys Asp Ile Lys Glu

1521 1530 1539

GAG TTC CTG AAG GGT GAG GAG AAA TCA 3'

Glu Phe Leu Lys Gly Glu Glu Lys
OC1/4V Endoglucanase (33GP1)



9 18 27 36 45 54

ATG GTA GAA AGA CAC TTC AGA TAT GTT CTT ACT TGC ACC CTG TTT CTT GTT ATC

Met Val Glu Arg His Phe Arg Tyr Val Leu Ile Cys Thr Leu Phe Leu Val Met


63 72 81 90 99 108

CTC CTA ATC TCA TCC ACT CAG TGT GGA AAA AAT GAA CCA AAC AAA AGA GTG AAT

Leu Leu Ile Ser Ser Thr Gln Cys Gly Lys Asn Glu Pro Asn Lys Arg Val Asn


117 126 135 144 153 162

AGC ATG GAA CAG TCA GTT GCT GAA ACT GAT AGC AAC TCA GCA TTT GAA TAC AAC

Ser Met Glu Gln Ser Val Ala Glu Ser Asp Ser Asn Ser Ala Phe Glu Tyr Asn


171 180 189 198 207 216

AAA ATG GTA GGT AAA GGA GTA AAT ATT GGA AAT GCT TTA GAA GCT CCT TTC GAA

Lys Met Val Gly Lys Gly Val Asn Ile Gly Asn Ala Leu Glu Ala Pro Phe Glu


225 234 243 252 261 270

GGA GCT TGG GGA GTA AGA ATT GAG GAT GAA TAT [illegible] GAG ATA ATA AAG AAA AGG

Gly Ala Trp Gly Val Arg Ile Glu Asp Glu Tyr Phe Glu Ile Ile Lys Lys Arg


279 288 297 306 315 324

GGA TTT GAT TCT GTT AGG ATT CCC ATA AGA TGG TCA GCA CAT ATA TCC GAA AAG

Gly Phe Asp Ser Val Arg Ile Pro Ile Arg Trp Ser Ala His Ile Ser Glu Lys


333 342 351 360 369 378

CCA CCA TAT GAT ATT GAC AGG AAT TTC CTC GAA AGA GTT AAC CAT GTT GTC GAT

Pro Pro Tyr Asp Ile Asp Arg Asn Phe Leu Glu Arg Val Asn His Val Val Asp


387 396 405 414 423 432

AGG GCT CTT GAG AAT AAT TTA ACA GTA ATC ATC AAT ACG CAC CAT TTT GAA GAA

Arg Ala Leu Glu Asn Asn Leu Thr Val Ile Ile Asn Thr His His Phe Glu Glu


441 450 459 468 477 486

CTC TAT CAA GAA CCG GAT AAA TAC GGC GAT GTT TTG GTG GAA ATT TGG AGA CAG

Leu Tyr Gln Glu Pro Asp Lys Tyr Gly Asp Val Leu Val Glu Ile Trp Arg Gln


495 504 513 522 531 540

ATT GCA AAA TTC TTT AAA GAT TAC CCG GAA AAT CTG TTC TTT GAA ATC TAC AAC

Ile Ala Lys Phe Phe Lys Asp Tyr Pro Glu Asn Leu Phe Phe Glu Ile Tyr Asn
OC1/4V Endoglucanase (33GP1) (continued)

549 558 567 576 585 594 

GAG CCT GCT CAG AAC TTC ACA GCT GAA AAA TGG AAC GCA CTT TAT CCA AAA GTC

Glu Pro Ala Gln Asn Leu Thr Ala Glu Lys Trp Asn Ala Leu Tyr Pro Lys Val

603 612 621 630 639 648 

CTC AAA GTT ATC AGG GAG AGC AAT CCA ACC CGG ATT GTC ATT ATC GAT GCT CCA

Leu Lys Val Ile Arg Glu Ser Asn Pro Thr Arg Ile Val Ile Ile Asp Ala Pro

657 666 675 684 693 702 

AAC TGG GCA CAC TAT AGC GCA GTG AGA AGT CTA AAA TTA GTC AAC GAC CGC 

Asn Trp Ala His Tyr Ser Ala Val Arg Ser Leu Lys Leu Val Asn Asp Lys Arg

711 720 729 738 747 756

ATC ATT GTT TCC TTC CAT TAC TAC GAA CCT TTC AAA TTC ACA CAT CAG GCT GCC

Ile Ile Val Ser Phe His Tyr Tyr Glu Pro Phe Lys Phe Thr His Gln Gly Ala

765 774 783 792 801 810

[illegible] TGG GTT AAT CCC ATC CCA CCT GTT AGG GTT AAG TGG AAT GGC GAG

[illegible] Trp Val Asn Pro Ile Pro Pro Val Arg Val Lys Trp Asn Gly Glu Glu Trp

819 828 837 846 855 864

GAA ATT AAC CAA ATC AGA AGT CAT TTC AAA TAC GTG ACT GAC TGG GCA AAG CAA

Glu Ile Asn Gln Ile Arg Ser His Phe Lys Tyr Val Ser Asp Trp Ala Lys Gln

873 882 891 900 909 918

AAT AAC CTA CCA ATC TTT CTT GGT GAA TTC GCT GCT TAT TCA AAA GCA GAC ATG

Asn Asn Val Pro Ile Phe Leu Gly Glu Phe Gly Ala Tyr Ser Lys Ala Asp Met

927 936 945 954 963 972

GAC TCA AGG GTT AAG TGG ACC GAA AGT GTG AGA AAA ATG GCG GAA GAA

Asp Ser Arg Val Lys Trp Thr Glu Ser Val Arg Lys Met Ala Glu Glu Phe Gly

981 990 999 1008 1017 1026 

TTT TCA TAC GCG TAT TGG GAA TTT TGT GCA GGA TTT GGC ATA TAC GAT AGA TGG 

Phe Ser Tyr Ala Tyr Trp Glu Phe Cys Ala Gly Phe Gly Ile Tyr Asp Arg Trp

1035 1044 1053 1062 1071 1080

CAA AAC TGG ATC GAA CCA TTG GCA ACA GCT GTG GTT GGC ACA GGC AAA GAG

[illegible] [illegible] [illegible] Trp Ile Glu Pro Leu Ala Thr Ala Val Val Gly Thr Gly Lys Glu

TAA 3'
Thermotoga maritima Pullulanase (6GP3)



ATG 


GAT 


CTT 
AAG


1 R

J. 0

GTG
GAT
297
GCC
CTT


TCT
351
AGA


AAA 


378 
GAC 


Tyr Val Arg Ile Val Leu Ser Glu Ser Leu Lys Glu Glu Asp Leu Arg Lys Asp


387 396 405 414 423 432

GTG GAA CTG ATC ATA GAA GGT TAC AAA CCG GCA AGA GTC ATC ATG ATG GAG ATC

Val Glu [...] Ile Ile Glu Gly Tyr Lys Pro Ala Arg Val Ile Met Met Glu Ile


441 450 459 468 477 486

CTG GAC GAC TAC TAT TAC GAT GGA GAG CTC GGA GCC GTA TAT TCT CCA GAG AAG

Leu Asp Asp Tyr Tyr Tyr Asp Gly Glu Leu Gly Ala Val Tyr Ser Pro Glu Lys


495 504 513 522 531 540

ACG ATA TTC AGA CTC TGG TCC CCC GTT TCT [...] TGG [...] TTC

Thr Ile Phe Arg Val Trp Ser Pro Val Ser Lys Trp Val Lys Val Leu Leu Phe
Thermotoga maritima Pullulanase (6GP3) (continued)

549 558 567 576 585 594 

AAA AAC GGA GAA GAC ACA GAA CCG TAC CAG GTT GTG AAC ATG GAA TAC AAG GGA 



Lys Asn Gly Glu Asp Thr Glu Pro Tyr Gln Val Val Asn Met Glu Tyr Lys Gly



603 612 621 630 639 648 

AAC GGG GTC TGG GAA GCG GTT GTT GAA GGC GAT CTC GAC GGA GTG TTC TAC CTC 

Asn Gly Val Trp Glu Ala Val Val Glu Gly Asp Leu Asp Gly Val Phe Tyr Leu 

657 666 675 684 693 702 

TAT CAG CTG GAA AAC TAC GGA AAG ATC AGA ACA ACC GTC GAT CCT TAT TCG AAA 

Tyr Gln Leu Glu Asn Tyr Gly Lys Ile Arg Thr Thr Val Asp Pro Tyr Ser Lys

711 720 729 738 747 756

GCG GTT TAC GCA AAC AAC CAA GAG AGC GCC GTT GTG AAT CTT GCC AGG ACA AAC

Ala Val Tyr Ala Asn Asn Gln Glu Ser Ala Val Val Asn Leu Ala Arg Thr Asn

765 774 783 792 801 810

CCA GAA GGA TGG GAA AAC GAC AGG GGA CCG AAA ATC GAA GGA TAC GAA GAC GCG

Pro Glu Gly Trp Glu Asn Asp Arg Gly Pro Lys Ile Glu Gly Tyr Glu Asp Ala

819 828 837 846 855 864 

ATA ATC TAT GAA ATA CAC ATA GCG GAC ATC ACA GGA CTC GAA AAC TCC GGG GTA 

Ile Ile Tyr Glu Ile His Ile Ala Asp Ile Thr Gly Leu Glu Asn Ser Gly Val

873 882 891 900 909 918 

AAA AAC AAA GGC CTC TAT CTC GGG CTC ACC GAA GAA AAC ACG AAA GGA CCG GGC 

Lys Asn Lys Gly Leu Tyr Leu Gly Leu Thr Glu Glu Asn Thr Lys Gly Pro Gly 

927 936 945 954 963 972 

GGT GTG ACA ACA GGC CTT TCG CAC CTT GTG GAA CTC GGT GTT ACA CAC GTT CAT 

Gly Val Thr Thr Gly Leu Ser His Leu Val Glu Leu Gly Val Thr His Val His 

981 990 999 1008 1017 1026

ATA CTT CCT TTC TTT GAT TTC TAC ACA GGC GAC GAA CTC GAT AAA GAT TTC GAG

Ile Leu Pro Phe Phe Asp Phe Tyr Thr Gly Asp Glu Leu Asp Lys Asp Phe Glu

1035 1044 1053 1062 1071 1080

AAG TAC TAC AAC TGG GGT TAC GAT CCT TAC CTG TTC ATG GTT CCG GAG GGC AGA

Lys Tyr Tyr Asn Trp Gly Tyr Asp Pro Tyr Leu Phe Met Val Pro Glu Gly Arg
Thermotoga maritima Pullulanase (6GP3) (continued)



1089 1098 1107 1116 1125 1134

TAC TCA ACC GAT CCC AAA AAC CCA CAC ACG AGA ATC AGA GAA CTC AAA GAA ATG

Tyr Ser Thr Asp Pro Lys Asn Pro His Thr Arg Ile Arg Glu Val Lys Glu Met


1143 1152 1161 1170 1179 1188

GTC AAA GCC CTT CAC AAA CAC GGT ATA [...] GTG ATT ATG GAC ATG GTG TTC CCT

[...] Gly Ile Gly Val Ile [...] Asp Met Val Phe Pro


1197 1206 1215 1224 1233 1242

CAC ACC TAC GGT ATA GGC GAA CTC TCT GCG TTC GAT CAG ACG GTG CCG TAC TAC

His Thr Tyr Gly Ile Gly Glu Leu Ser Ala Phe Asp Gln Thr Val Pro Tyr Tyr


1251 1260 1269 1278 1287 1296

TTC TAC AGA ATC GAC AAG ACA GGT GCC TAT TTG AAC GAA AGC GGA TGT GGT AAC

Phe Tyr Arg Ile Asp Lys Thr Gly Ala Tyr Leu Asn Glu Ser Gly Cys Gly Asn


1305 1314 1323 1332 1341 1350

GTC ATC GCA AGC GAA AGA [...] ATG ATG AGA AAA TTC ATA GTC GAT ACC GTC ACC

Val Ile Ala Ser Glu Arg [...] Met Met Arg Lys Phe Ile Val Asp Thr Val Thr


1359 1368 1377 1386 1395 1404

TAC TGG GTA AAG GAG TAT CAC ATA GAC GGA TTC AGG TTC GAT CAG ATG GGT CTC

Tyr Trp Val Lys Glu Tyr His Ile Asp Gly Phe Arg Phe Asp Gln Met Gly Leu


1413 1422 1431 1440 1449 1458

ATC GAC AAA AAG ACA ATG CTC GAA GTC GAA AGA GCT CTT CAT AAA ATC GAT CCA

Ile Asp Lys Lys Thr Met Leu Glu Val Glu Arg Ala Leu His Lys Ile Asp Pro


1467 1476 1485 1494 1503 1512

ACT ATC ATT CTC TAC GGC GAA CCG TGG GGT GGA TGG GGA GCA CCG ATC AGG TTT

Thr Ile Ile Leu Tyr Gly Glu Pro Trp Gly Gly Trp Gly Ala Pro Ile Arg Phe


1521 1530 1539 1548 1557 1566

GGA AAG AGC GAT GTC GCC GGC ACA CAC GTG GCA GCT TTC AAC GAT GAG TTC AGA

Gly Lys Ser Asp Val Ala Gly Thr His Val Ala Ala Phe Asn Asp Glu Phe Arg


1575 1584 1593

GAC GCA ATA AGG GGT TCC GTG TTC AAC CCG
Thermotoga maritima Pullulanase (6GP3) (continued)

1629 1638 1647 1656 1665 1674

GGA TAC GGA AAG GAA ACC AAG ATC AAA AGG GGT GTT GTT GGA AGC ATA AAC TAC

Gly Tyr Gly Lys Glu Thr Lys Ile Lys Arg Gly Val Val Gly Ser Ile Asn Tyr

1683 1692 1701 1710 1719 1728 

GAC GGA AAA CTC ATC AAA AGT TTC GCC CTT GAT CCA GAA GAA ACT ATA AAC TAC 

Asp Gly Lys Leu Ile Lys Ser Phe Ala Leu Asp Pro Glu Glu Thr Ile Asn Tyr

1737 1746 1755 1764 1773 1782

GCA GCG TGT CAC GAC AAC CAC ACA CTG TGG GAC AAG AAC TAC CTT GCC GCC AAA

Ala Ala Cys His Asp Asn His Thr Leu Trp Asp Lys Asn Tyr Leu Ala Ala Lys

1791 1800 1809 1818 1827 1836

GCT GAT AAG AAA AAG GAA TGG ACC GAA GAA GAA CTG AAA AAC GCC CAG AAA CTG

Ala Asp Lys Lys Lys Glu Trp Thr Glu Glu Glu Leu Lys Asn Ala Gln Lys Leu

1845 1854 1863 1872 1881 1890

GCT GGT GCG ATA CTT CTC ACT TCT CAA GGT GTT CCT TTC CTC CAC GGA GGG CAG

Ala Gly Ala Ile Leu Leu Thr Ser Gln Gly Val Pro Phe Leu His Gly Gly Gln

1899 1908 1917 1926 1935 1944

GAC TTC TCC AGG ACG ACG AAT TTC AAC GAC AAC TCC TAC AAC GCC CCT ATC TCG

Asp Phe Cys Arg Thr Thr Asn Phe Asn Asp Asn Ser Tyr Asn Ala Pro Ile Ser

1953 1962 1971 1980 1989 1998

ATA AAC GGC TTC GAT TAC GAA AGA AAA CTT CAG TTC ATA GAC GTG TTC AAT TAC

Ile Asn Gly Phe Asp Tyr Glu Arg Lys Leu Gln Phe Ile Asp Val Phe Asn Tyr

2007 2016 2025 2034 2043 2052

CAC AAG GGT CTC ATA AAA CTC AGA AAA GAA CAC CCT GCT TTC AGG CTG AAA AAC

His Lys Gly Leu Ile Lys Leu Arg Lys Glu His Pro Ala Phe Arg Leu Lys Asn

2061 2070 2079 2088 2097 2106

GCT GAA GAG ATC AAA AAA CAC CTG GAA TTT CTC CCG GGC GGG AGA AGA ATA GTT

Ala Glu Glu Ile Lys Lys His Leu Glu Phe Leu Pro Gly Gly Arg Arg Ile Val

2115 2124 2133 2142 2151 2160

GCG TTC ATG CTT AAA GAC CAC GCA GGT GGT GAT CCC TGG AAA GAC ATC GTG GTG

Ala Phe Met Leu Lys Asp His Ala Gly Gly Asp Pro Trp Lys Asp Ile Val Val

Figure 14 (Continued) 
Thermotoga maritima Pullulanase (6GP3) (continued)

2169 2178 2187 2196 2205 2214 

ATT TAC AAT GGA AAC TTA GAG AAG ACA ACA TAC AAA CTG CCA GAA GGA AAA TGG

Ile Tyr Asn Gly Asn Leu Glu Lys Thr Thr Tyr Lys Leu Pro Glu Gly Lys Trp

2223 2232 2241 2250 2259 2268 

AAT GTG GTT GTG AAC AGC CAG AAA GCC GGA ACA GAA GTG ATA GAA ACC GTC GAA 



Asn Val Val Val Asn Ser Gln Lys Ala Gly Thr Glu Val Ile Glu Thr Val Glu

2277 2286 2295 2304 2313

GGA ACA ATA GAA CTC GAT CCG CTT TCC GCG TAC GTT CTG TAC AGA GAG TGA 3'

Gly Thr Ile Glu Leu Asp Pro Leu Ser Ala Tyr Val Leu Tyr Arg Glu ***